Abstract


SDD-1,  a  System  for  Distributed  Databases,  is  a 
distributed  database  system  being  developed  by  CCA.  SDD-1 
permits  data  to  be  stored  redundantly  at  several  database 
sites  in  order  to  enhance  the  reliability  and 
responsiveness  of  the  system  and  to  facilitate  upwards 
scaling  of  system  capacity.  This  paper  describes  the 
algorithm  used  by  SDD-1  for  updating  data  that  is  stored 
redundantly.

1. Introduction

SDD-1  is  a  prototype  distributed  database  system  currently 
being  designed  at  Computer  Corporation  of  America  [ROTHNIE 
and  GOODMAN].  The  system  will  use  the  data  storage 
facilities  of  Datacomputers  [MARILL  and  STERN]  that  are 
scattered around an Arpanet environment [METCALF]. This
report describes the basic approach to the problem of
redundant update in SDD-1. Descriptions of other aspects
of SDD-1, such as retrieval and reliability, are reported
elsewhere [ROTHNIE and GOODMAN], [WONG], [HAMMER and
SHIPMAN].

Several  solutions  have  recently  been  suggested  to  the 
concurrent  update  problem  in  a  distributed  database  system 
(see  discussion  in  [ROTHNIE  and  GOODMAN]).  The  techniques 
include  performing  all  updates  at  a  primary  site  [ALSBERG 
and  DAY],  or  using  a  voting  discipline  to  perform  an 
update  on  a  data  item  after  the  sites  that  hold  a  copy  of 
that  data  item  have  agreed  to  the  update  [THOMAS]. 
However, these methods suffer either from a potential
bottleneck on updates or from heavy communication traffic.

The approach to be discussed in this paper attempts to
overcome both problems by preanalyzing those transactions
that will be run frequently, so as to select those
transaction types that can be run using little or even no
synchronization.

The preanalysis technique determines, for each type of
transaction, the level of synchronization required for
that transaction type. The analysis is based on knowledge
of which portions of the database each transaction will
read or write; these are invariant properties of each
transaction type and are in no sense stochastic. The major
assumption is that the types of transactions that account
for most of the database activity are predictable in the
sense that they operate only on certain restricted portions
of the database.

The SDD-1 system will permit data to be stored redundantly
around the network without restricting any one copy of a
logical data item to be the primary copy for updates. The
retrieval algorithm will be truly distributed, aggregating
data at a single site for synchronization purposes only
when necessary [WONG]. The system will also be able to
run in spite of multiple site failures and will be able to
recover when down sites return to operation [HAMMER and
SHIPMAN].

In this paper we describe the formal methods used to
analyze the degree of synchronization required by
transactions in SDD-1. While we believe our method to be
quite general, the discussion will be limited to its
application in the SDD-1 environment.

A simplified version of the SDD-1 concurrent update
methodology was presented in [ROTHNIE et al] and
[BERNSTEIN et al]. We develop this technique more
completely in Sections 2 and 3. The proof of correctness
of our synchronization rules is presented in Section 4.
In Section 5, a further mechanism is described which
extends the earlier results.

2. The SDD-1 Architecture


2.1 Overview


An SDD-1 database system consists of a set of sites, each
site residing at a single node of the network. A site
provides some or all of the following subsystems:

1. data module - maintains a stored copy of portions
of  the  logical  database  and  supervises  read  and  write 
operations  on  its  copy; 

2.  transaction  module  -  processes  transactions,  one 
at  a  time,  by  communicating  with  data  modules; 

3.  terminal  module  -  provides  a  user  interface  that 
routes  each  user  transaction  to  the  appropriate 
transaction  module  for  processing. 

From  the  user’s  viewpoint,  a  transaction  is  entered  at  a 
terminal  and  received  by  the  terminal  module  that  controls 
that  terminal.  The  terminal  module  examines  the 
transaction  and  decides  which  transaction  module  should 
execute it; the transaction module may or may not reside
at the same site where the terminal module is located
(i.e., the transaction module may be at a foreign site).
The  terminal  module  may  pass  certain  synchronization 
information  to  the  transaction  module,  in  addition  to  the 
text  of  the  transaction,  to  synchronize  this  transaction 
with  other  transactions  that  ran  at  the  same  terminal. 

A  transaction  module  receives  transactions  from  many 
different  (possibly  foreign)  terminal  modules.  For  each 
transaction  it  receives,  a  transaction  module  interacts 
with  various  (possibly  foreign)  data  modules  to  obtain  the 
portion  of  the  database  necessary  for  processing  the  read 
and  write  operations  requested  by  the  transaction. 
Results of the transaction (e.g., printed output) are
passed back to the terminal module that sent the
transaction.

A  data  module  is  a  database  management  facility  that 
processes  read  and  write  operations  from  (possibly 
foreign)  transaction  modules.  Certain  synchronization 
facilities  are  supported  by  the  data  module  so  that 
transactions  are  able  to  obtain  a  consistent  view  of  the 
database.  The  synchronization  facilities  supplied  by  a 
data  module  are  entirely  local  to  that  data  module  and  do 
not  require  that  the  data  module  ever  explicitly  cooperate 
(via  message  passing,  say)  with  other  data  modules. 


[Figure 2.1: Overview of Logical Architecture]

The three kinds of modules supported by SDD-1 constitute
three  levels  of  virtual  machines  (see  figure  2.1).  At  the 
lowest  level  are  the  data  modules.  They  provide  a 
facility  for  processing  read  and  write  commands 
atomically.  At  the  second  level  are  transaction  modules. 
Transaction  modules  provide  a  facility  for  processing 
transactions  and  guarantee  that  the  union  of  all 
transactions  processed  by  an  SDD-1  system  is  "serially 
reproducible"  (this  concept,  discussed  in  [ROTHNIE  and 
GOODMAN],  will  be  developed  in  great  detail  in  the 
sequel).  At  the  third  level  are  terminal  modules. 
Terminal  modules  provide  a  user  interface  and  guarantee 
certain consistency conditions among transactions run at
that terminal (in addition to serial reproducibility).
While we will not discuss the particular software/hardware
structure that will be used to implement the virtual
machines, one can think of the three types of modules
as being implemented as software processes, with each data
module incorporating a Datacomputer [MARILL and STERN].

2.2 Distributed Data Organization

A  logical  database  in  SDD-1  consists  of  a  set  of  relations 
[CODD]. Each relation has one domain named "tuple
identifier" (TID) which is a key of the relation; that
is, no two tuples of a relation can have identical TID
values.

Each relation is partitioned into a set of logical
fragments. Logical fragments are defined by first
partitioning the set of all possible tuples of the
relation into a set of mutually exclusive partitions.
For example, the EMPLOYEE relation could be partitioned by
DEPARTMENT, so that each partition contains all of the
employee tuples in a single department. A logical
fragment consists of a projection of a partition on the
TID  domain  and  one  other  domain.  The  inclusion  of  the  TID 
domain  guarantees  that  the  logical  fragment  has  exactly 
one  tuple  for  each  tuple  of  the  partition  from  which  it 
was  selected. 
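
The construction of logical fragments can be illustrated with a minimal sketch. The sample EMPLOYEE tuples, the domain names other than TID and DEPARTMENT, and the helper names partition_by and logical_fragment are all assumptions made for illustration; they are not taken from SDD-1.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE tuples; TID is the key of the relation.
EMPLOYEE = [
    {"TID": 1, "NAME": "Jones", "DEPARTMENT": "Sales", "SALARY": 100},
    {"TID": 2, "NAME": "Smith", "DEPARTMENT": "Sales", "SALARY": 120},
    {"TID": 3, "NAME": "Brown", "DEPARTMENT": "Research", "SALARY": 150},
]


def partition_by(relation, domain):
    """Split a relation into mutually exclusive partitions on one domain."""
    partitions = defaultdict(list)
    for row in relation:
        partitions[row[domain]].append(row)
    return dict(partitions)


def logical_fragment(partition, domain):
    """Project a partition on the TID domain and one other domain."""
    return [(row["TID"], row[domain]) for row in partition]


partitions = partition_by(EMPLOYEE, "DEPARTMENT")
sales_salary = logical_fragment(partitions["Sales"], "SALARY")
```

Because every projected tuple carries its TID, the fragment sales_salary has exactly one tuple per tuple of the Sales partition.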

A stored copy of a logical fragment is called a stored
fragment. Stored fragments are the units of data
distribution; a stored fragment is either entirely
present or entirely absent at a data module. Note that
several stored fragments from a single partition of a
relation might conveniently be stored as a single file at
a data module so that the TID domain need not be repeated
for every fragment.

We  do  not  require  that  two  stored  copies  of  a  logical 
fragment  at  two  different  data  modules  be  identical  at  all 
times. The redundant update mechanism will be responsible
for allowing only consistent copies to be read.

Each  logical  fragment  is  partitioned  into  logical  data 
items,  a  stored  copy  of  which  is  called  a  stored  data 
item. A data item is the smallest updatable unit in the
database.

Each  logical  data  item  may  have  several  associated  stored 
data  items.  Hence,  when  referencing  a  logical  data  item, 
it  is  necessary  to  choose  a  particular  stored  data  item  to 
reference.  The  concept  of  materialization  is  convenient 
here.  Formally,  a  materialization  is  a  total  function 
from  the  set  of  logical  fragments  into  the  set  of  stored 
fragments.  That  is,  a  materialization  is  an  assignment  of 
a  stored  fragment  for  each  logical  fragment. 
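
As a rough sketch of this definition, a materialization can be modelled as a dictionary that must cover every logical fragment and must select, for each one, a data module that actually holds a stored copy of it. The fragment names, data module names, and the function is_materialization are hypothetical, and a stored fragment is identified here simply by the data module that holds it.

```python
# Hypothetical directory: logical fragment -> data modules holding a stored copy.
STORED_COPIES = {
    "EMP.Sales.SALARY": {"dm_alpha", "dm_beta"},
    "EMP.Research.SALARY": {"dm_beta"},
}


def is_materialization(mapping, stored_copies):
    """True if mapping assigns every logical fragment one of its own stored copies."""
    if set(mapping) != set(stored_copies):  # must be a total function
        return False
    return all(site in stored_copies[frag] for frag, site in mapping.items())


m1 = {"EMP.Sales.SALARY": "dm_alpha", "EMP.Research.SALARY": "dm_beta"}
m2 = {"EMP.Sales.SALARY": "dm_beta", "EMP.Research.SALARY": "dm_beta"}
partial = {"EMP.Sales.SALARY": "dm_alpha"}
wrong_site = {"EMP.Sales.SALARY": "dm_alpha", "EMP.Research.SALARY": "dm_alpha"}
```

Here m1 and m2 are both valid and differ only in which stored copy of the Sales fragment they select; partial and wrong_site are rejected.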

Each  transaction  is  said  to  run  in  a  particular 
materialization  of  the  database.  The  materialization  of  a 
transaction specifies which copies of logical fragments
are to be read. In order to maintain the internal
consistency  of  all  stored  copies  of  a  particular  logical 
fragment,  a  transaction  must  perform  its  updates  on  all 
stored  copies  of  each  logical  data  item  (not  just  the  copy 
specified  by  the  materialization).  As  a  result, 
materializations  are  not  useful  when  considering  write 
operations.  The  process  of  updating  fragments  will  be 
described  later  in  some  detail. 


There  are  no  logical  restrictions  on  how  to  configure  a 
materialization,  other  than  that  each  logical  fragment 
must  map  into  a  stored  copy  of  that  same  fragment.  A 
materialization  need  not,  for  example,  obtain  any  of  its 
stored  fragments  from  the  site  at  which  it  executes. 
Also,  two  materializations  may  use  different  stored 
copies  of  a  single  logical  fragment.  Two  transactions 
concurrently  running  in  these  materializations  may 
therefore  read  different  stored  copies  of  a  single  logical 
fragment concurrently. The system as a whole does not
support a single primary copy of a logical fragment for
all materializations. How the system avoids race
conditions in such an apparently chaotic environment is
the main subject of this report.


SDD-1  Concurrency  Control  Mechanism 

2.3  Transactions 

The basic unit of a user computation in SDD-1 is the
transaction. Transactions are structured to execute in
three sequential steps:

1. The transaction reads a subset of the database,
called its read-set, into a workspace.

2. It does some computation on the workspace.

3. The transaction writes some of the values in its
workspace back into a subset of the database, called
its write-set.

The  read-set  and  write-set  of  a  transaction  are  defined  on 
the  logical  database.  That  is,  the  transaction  references 
only  logical  data  items;  it  has  no  knowledge  of  its 
materialization  or  of  the  distribution  and  redundancy  of 
stored  copies. 
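
The three-step structure can be sketched as follows. This is only an illustration under simplifying assumptions: a single Python dictionary stands in for the logical database, and the names run_transaction and give_raise and the sample data items are hypothetical.

```python
def run_transaction(database, read_set, compute, write_set):
    """Run one transaction against a dict that stands in for the logical database."""
    # Step 1: read the read-set into a private workspace.
    workspace = {item: database[item] for item in read_set}
    # Step 2: compute on the workspace only.
    compute(workspace)
    # Step 3: write back only the items named in the write-set.
    for item in write_set:
        database[item] = workspace[item]
    return workspace


def give_raise(workspace):
    workspace["salary"] = workspace["salary"] + workspace["bonus"]


db = {"salary": 100, "bonus": 10, "name": "Jones"}
run_transaction(db, read_set={"salary", "bonus"}, compute=give_raise,
                write_set={"salary"})
```

The transaction names only logical data items; nothing in it refers to stored copies or to where they reside.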

The  workspace  into  which  data  is  read  is,  in  general, 
distributed.  That  is,  various  parts  of  the  workspace  may 
reside  at  different  data  modules.  In  SDD-1,  the  execution 
of  a  transaction  is  also,  in  general,  distributed; 
processes running at the various data modules each operate
on the portion of the workspace located at that data module.
These processes run concurrently and/or sequentially with
respect to one another and transfer data between data
modules as needed. The processes running at the data
modules are initiated and coordinated by the original
transaction module to which the transaction was submitted.

This function is performed by the access planner
sub-module within the transaction module. The access
planner  converts  the  original  transaction  as  submitted  by 
the  user  into  a  number  of  local  data  management  processes 
running  at  the  data  modules  where  the  workspace  is  stored. 
The  algorithms  used  by  the  access  planner  are  described  in 
[WONG].  Again,  this  distribution  of  processing  is 
entirely  internal  to  SDD-1  and  is  not  reflected  in  the 
user's  transaction  in  any  way. 

To  process  a  transaction,  a  transaction  module  must  obtain 
the  read-set  data  for  the  transaction's  input  and  later 
write  its  output  into  copies  of  its  write-set.  These 
functions  are  performed  by  sending  READ  and  WRITE 
messages,  respectively,  to  data  modules. 


A  READ  message  for  a  transaction  is  sent  to  a  data  module 
and  is  a  request  to  read  some  of  the  stored  data  items  at 
that  data  module.  Each  stored  item  that  is  requested  must 
be the particular stored copy of a logical data item in
the  read-set  of  the  transaction  that  is  specified  by  the 
materialization  in  which  the  transaction  runs.  So,  if  a 
transaction  wants  to  read  logical  data  item  x,  and  the 
transaction's  materialization  associates  x  with  its 
particular  stored  copy  at  data  module  alpha,  then  to  read 
x  the  transaction  must  send  a  READ  message  to  alpha. 

A  WRITE  message  is  sent  from  a  transaction  module  to  a 
data  module  to  report  updates  that  have  taken  place  to 
certain  data  items  as  a  result  of  executing  a  transaction 
by  that  transaction  module.  If  a  transaction  updates  a 
particular  logical  data  item  x,  WRITE  messages  are  sent  to 
all  data  modules  that  have  a  stored  copy  of  x  (not  just  to 
the one stored copy associated with the transaction's
materialization).

A  transaction  module  sends  at  most  one  READ  message  and  at 
most  one  WRITE  message  to  any  particular  data  module  on 
behalf  of  a  single  transaction.  If  a  transaction  reads 
data  from  two  stored  fragments  which  reside  at  the  same 
data  module,  for  example,  then  only  one  READ  message  will 
be  issued  to  read  from  both  fragments.  This  is  an 
important point, as each data module must perform READs
and WRITEs as atomic operations; for example, none of
the data read by a READ message can be updated by some
WRITE while the READ is being processed.
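
The routing rules for READ and WRITE messages can be sketched as below. The function plan_messages, its parameters, and the sample fragment and data module names are assumptions for illustration only. Grouping the requested items by data module is what yields at most one READ and at most one WRITE message per data module.

```python
from collections import defaultdict


def plan_messages(read_set, write_set, fragment_of, materialization, stored_copies):
    """Return ({dm: items to read}, {dm: items to write}) for one transaction."""
    reads = defaultdict(set)
    for item in read_set:
        # Read only the stored copy chosen by the materialization.
        reads[materialization[fragment_of[item]]].add(item)
    writes = defaultdict(set)
    for item in write_set:
        # Write every stored copy, regardless of the materialization.
        for dm in stored_copies[fragment_of[item]]:
            writes[dm].add(item)
    return dict(reads), dict(writes)


fragment_of = {"x": "F1", "y": "F2"}                        # item -> logical fragment
stored_copies = {"F1": {"alpha", "beta"}, "F2": {"alpha"}}  # fragment -> data modules
materialization = {"F1": "alpha", "F2": "alpha"}            # fragment -> chosen copy

reads, writes = plan_messages({"x", "y"}, {"x"}, fragment_of,
                              materialization, stored_copies)
```

In this example both x and y are read with a single READ message to alpha, while the update to x produces one WRITE message to alpha and one to beta.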

2.4 System Consistency Guarantees


One  of  the  important  advantages  of  SDD-1  is  its  ability  to 
maintain  multiple  copies  of  the  same  logical  piece  of  data 
at  several  different  data  modules.  It  is  this  capability 
of  SDD-1  that  presents  the  most  difficult  technical 
problems.  The  system  must  maintain  the  consistency  of  all 
copies  of  data  and  ensure  that  the  READ  requests  for  a 
transaction  retrieve  a  correct  state  of  the  database.  In 
addition,  transactions  reading  or  writing  data  in  several 
data  modules  must  be  synchronized  to  ensure  that  a 
transaction  does  not  read  partial  results  of  another 
transaction.  If  transactions  are  allowed  to  run  in  an 
arbitrary  interleaved  manner  without  coordination,  various 
anomalies  in  system  operation  may  occur.  The  system 
design  guarantees  two  properties  which  prevent  these 
anomalies  from  occurring. 

System Property 1: Convergence - If updates were to be
quiesced, then after some finite period of time all
transactions which read the same logical data item will
retrieve the same value for it. Essentially this means
that all physical copies of a logical data item will
eventually converge to the same value.

System Property 2: Serial Reproducibility (or
Serializability)  -  The  operation  of  the  system  when 
running  transactions  in  an  interleaved  manner  is 
equivalent  to  a  history  of  operation  in  which  each  of  the 
transactions  runs  alone  to  completion  before  the  next  one 
begins.  That  is,  the  interleaved  operation  is 
reproducible  by  an  equivalent  one  in  which  the 
transactions  run  serially.  By  "equivalent",  we  mean  that 
each  transaction  produces  the  same  output  values  and  that 
the  final  state  of  the  database  is  the  same.  The  concept 
of  serial  reproducibility  is  crucial  to  an  understanding 
of  the  system  and  will  be  taken  up  in  detail  later. 


These two system properties are provided at the
transaction module level. That is, the set of all
transactions submitted to transaction modules must satisfy
these properties. The terminal modules provide a level of
system guarantee beyond that of the transaction module.
These guarantees, however, are not the main subject of this
paper.

2.5 Terminal Modules


A  transaction  is  entered  at  a  terminal  and  is  received  by 
the  terminal  module  connected  to  that  terminal.  The 
terminal  module  must  determine  the  read-set  and  write-set 
of  the  transaction.  This  information  will  be  used  to 
decide  which  transaction  module  should  execute  the 
transaction,  as  each  transaction  module  handles  only 
certain  classes  of  transactions.  For  example,  in  an 
airline  reservation  system,  each  transaction  module  may 
execute  transactions  corresponding  to  flights  originating 
at  a  certain  city.  By  examining  the  read-set  and 
write-set  of  a  reservation  transaction,  a  terminal  module 
can  determine  the  originating  city  and  thereby  is  able  to 
choose an appropriate transaction module to execute the
transaction.

The  terminal  module  makes  sequencing  guarantees  above  and 
beyond  those  of  the  transaction  modules.  The  terminal 
module  incorporates  certain  synchronization  information 
with  the  transaction  before  sending  it  to  a  transaction 
module.  This  information  allows  the  transaction  module  to 
avoid  certain  sequencing  anomalies  with  respect  to  other 
transactions entered at the same terminal.

The main body of this paper, however, concerns the design
and interaction of the transaction modules and data
modules. For convenience, transaction modules and data
modules will be referred to as TM's and DM's,
respectively, in the sequel.

2.6 Timestamps


System  property  1,  convergence,  is  provided  in  SDD-1 
through  the  use  of  a  timestamping  mechanism.  Each  TM  has 
a  clock  used  for  generating  globally  unique  timestamps. 
After  a  clock  has  been  read,  it  cannot  be  read  again  until 
it  has  been  incremented.  By  appending  the  TM  number  as 
the  low  order  bits  of  each  timestamp,  we  ensure  that  every 
timestamp  is  globally  unique  within  the  system.  This 
method  of  generating  unique  timestamps  was  suggested  in 
[THOMAS].

None of the mechanisms described in this report requires
that clocks running in different TM's be at all
synchronized. For reasons of efficiency, however, it is
necessary to assume that clock values in different TM's are
reasonably close to each other. In [LAMPORT] a method of
synchronizing clocks in a network is described that
involves pushing ahead a local clock if a message with a
future timestamp is received. This simple method will
keep clocks sufficiently well synchronized for the
purposes of SDD-1.

Each transaction, before being run, is assigned a unique
timestamp. The transaction's timestamp will be carried on
all its WRITE messages.
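
The timestamp scheme described in this section might be sketched as follows. The class name TMClock, the use of an integer counter as the clock, and the 8-bit width of the TM number field are assumptions for illustration; the source specifies only that the TM number occupies the low order bits and that a clock is pushed ahead on receipt of a future timestamp.

```python
class TMClock:
    """Timestamp source for one transaction module (TM)."""

    TM_BITS = 8  # assumed width of the TM number field

    def __init__(self, tm_number, start=0):
        assert 0 <= tm_number < (1 << self.TM_BITS)
        self.tm_number = tm_number
        self.counter = start

    def next_timestamp(self):
        # Increment before every read so no two reads return the same value.
        self.counter += 1
        # The TM number in the low order bits makes the result globally unique.
        return (self.counter << self.TM_BITS) | self.tm_number

    def observe(self, timestamp):
        # Push the local clock ahead if a message carries a future timestamp.
        self.counter = max(self.counter, timestamp >> self.TM_BITS)


tm1, tm2 = TMClock(1), TMClock(2, start=50)
a = tm1.next_timestamp()
b = tm2.next_timestamp()
tm1.observe(b)            # tm1 receives a message stamped by tm2
c = tm1.next_timestamp()
```

After tm1 observes the timestamp b issued by the faster clock at tm2, its next timestamp c exceeds b, so the two clocks stay reasonably close without any further coordination.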

In  addition,  timestamps  are  maintained  for  every  updatable 
physical  data  item  in  the  database.  Note  that  a  timestamp 
is  associated  with  each  physical  data  item,  rather  than 
with  the  logical  data  item;  there  may  be  many  physical 
copies  of  a  logical  data  item  and  each  copy  of  the  logical 
data  item  has  its  own  timestamp.  This  timestamp  is  the 
timestamp  of  the  last  WRITE  message  which  updated  that 
physical  data  item. 

In  order  to  implement  property  1,  convergence,  each  data 
module  obeys  the  following  rule:  A  data  item  is  updated 
by  a  WRITE  message  if  and  only  if  the  data  item's 
timestamp  is  less  than  the  timestamp  of  the  WRITE  message. 
So,  to  process  a  single  WRITE  message  at  a  data  module  the 
following  procedure  is  used.  For  each  data  item  in  the 
WRITE  message,  the  timestamp  in  the  WRITE  message  is 
compared  with  the  timestamp  of  the  stored  data  item  at 
that  data  module.  If  the  timestamp  in  the  WRITE  message 
is  greater  than  the  timestamp  of  the  stored  data  item, 
then  the  new  value  of  the  data  item  in  the  WRITE  message 
is  written  into  the  stored  data  item  with  the  new 
timestamp.  If  the  timestamp  of  the  WRITE  message  is  less 
than the timestamp of the stored data item, then the
update  is  not  performed  on  that  data  item.  This  is  a  data 
item  by  data  item  check;  some  data  items  in  the  WRITE 
message  may  result  in  update  operations  while  others  may 
not.  Also,  if  a  data  item  in  the  WRITE  message  is  part  of 
a  fragment  that  is  not  stored  at  the  data  module,  then  the 
update  is  not  performed. 
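
The WRITE processing rule can be illustrated with a small sketch. The class DataModule and its method apply_write are hypothetical names, stored data items are held as (value, timestamp) pairs, and an initial timestamp of 0 is assumed to be smaller than any transaction timestamp.

```python
class DataModule:
    """Stored data items at one DM, each held as a (value, timestamp) pair."""

    def __init__(self, stored_items):
        self.items = {name: (None, 0) for name in stored_items}

    def apply_write(self, updates, timestamp):
        """Apply one WRITE message; return the names of the items actually updated."""
        applied = []
        for name, value in updates.items():
            if name not in self.items:
                continue                   # item not stored here: no update
            _, stored_ts = self.items[name]
            if stored_ts < timestamp:      # update only if the WRITE is newer
                self.items[name] = (value, timestamp)
                applied.append(name)
        return applied


dm1, dm2 = DataModule(["x", "y"]), DataModule(["x"])
older_updates, older_ts = {"x": 1, "y": 1}, 10
newer_updates, newer_ts = {"x": 2}, 20

# dm1 receives the WRITE messages in timestamp order.
dm1.apply_write(older_updates, older_ts)
dm1.apply_write(newer_updates, newer_ts)

# dm2 receives the same messages in the opposite order.
dm2.apply_write(newer_updates, newer_ts)
skipped = dm2.apply_write(older_updates, older_ts)
```

Both data modules end with the same value and timestamp for x even though the messages arrived in different orders. At dm2 the older WRITE performs no update at all: its value for x is stale, and y is not stored there.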

It  will  be  quite  common  for  WRITE  messages  to  contain  many 
data  item  updates  that  are  not  performed.  This  will 
happen  when  a  WRITE  message  for  a  recent  transaction  that 
updates  some  data  item  is  processed  at  a  DM  before  a  WRITE 
message  for  an  earlier  (i.e.,  older)  transaction  that 
updates  the  same  data  item.  Such  situations  are  not 
errors. They are simply the way that the system makes
updates take effect in the same order in which the
transactions actually executed.

2.7 Interleaved Transactions


The  system  usually  has  many  transactions  in  progress  at 
any  one  time,  both  because  there  are  multiple  TM's 
operating  concurrently  within  the  system  and  because 
individual TM's are processing transactions concurrently.
The resulting arbitrary interleavings of READs and WRITEs
can introduce serious problems of database consistency.
System Property 2, serial reproducibility, deals with this
problem.

The  issue  of  serial  reproducibility  arises  because  a 
system's  atomic  actions  are  at  a  finer  granularity  than 
its  users'  atomic  actions.  In  our  case,  the  users'  atomic 
operations  are  user  transactions,  while  the  system's 
atomic  actions  can  be  taken  to  be  the  execution  of  READ 
and WRITE messages at the DM's. Each DM behaves as if
READs and WRITEs are processed as indivisible units.
That  is,  it  is  not  possible  for  a  READ  operation  to 
observe  the  effects  of  only  a  part  of  a  WRITE  operation  at 
a  DM. 


When  a  system  allows  the  execution  of  several  user 
transactions  at  the  same  time,  then  the  system  atomic 
operations corresponding to different user transactions
are interleaved. There is no guarantee that the behavior
of such a system conforms to the user's expectation that
each transaction is treated as an indivisible unit (a
user's transaction should not examine the database during
the execution of another user's transaction, when the
database is possibly in an inconsistent state).

Serial reproducibility requires that a system operating in
an interleaved manner be equivalent to a system in which
each  transaction  is  processed  in  its  entirety  before 
another  one  is  begun.  In  other  words,  for  any  given 
interleaved  execution,  there  exists  an  ordering  of  atomic 
transactions,  called  a  serial  ordering,  which  is 
equivalent  to  the  interleaved  operation  which  in  fact 
occurs.  By  "equivalent"  we  mean  that  each  transaction  in 
the  interleaved  ordering  reads  the  same  data  as  it  would 
have  read  if  the  transactions  had  been  run  one  at  a  time 
in  the  serial  order  (and  hence,  will  produce  the  same 
output). Note that serial reproducibility requires only
that  there  exists  some  serial  order  equivalent  to  the 
actual  interleaved  operation.  There  may  in  fact  be 
several  such  equivalent  serial  orderings. 

The modelling of correct concurrent operation by the
concept of serial reproducibility is based on the
assumption that each user transaction will preserve
database  consistency  if  it  runs  atomically.  That  is,  if 
only  one  transaction  were  allowed  to  execute  at  a  time, 
and  if  the  database  state  is  consistent,  then  after 
executing  a  transaction  the  database  state  will  be 
consistent.  So,  a  serial  ordering  of  transaction 
executions  will,  by  induction,  result  in  a  consistent 
database state. Since a serially reproducible history of
operation is equivalent to some serial ordering, the
serially reproducible history results in a consistent
database state as well.

If  a  system  does  not  guarantee  serial  reproducibility  then 
anomalies  can  result  from  operation  of  the  system. 
Consider,  for  example,  the  following  scenario  in  SDD-1. 
We  assume  a  single  copy  of  data  item  x,  which  initially 
has  the  value  x=0.  There  are  two  transactions  in  the 
system;  transaction  i  sets  x:=x+1,  and  transaction  j  sets 
x:=x+2.  The  following  sequence  of  events  occurs: 
Transaction  i  reads  x=0 
Transaction  j  reads  x=0 
Transaction  j  sets  x:=2 
Transaction  i  sets  x:=1 

Any  execution  of  the  two  transactions  one  after  the  other 
would  have  resulted  in  setting  x  to  3.  The  result  of  the 
interleaved execution was to set x to 1, contrary to the
user's intention. To guarantee serial reproducibility, we
need a mechanism that prevents these kinds of undesirable
interleavings.
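
The scenario can be replayed with a short sketch that executes a schedule of read and write steps on the single data item x. The function run and the encoding of a schedule as a list of (transaction, action) pairs are assumptions made for illustration.

```python
def run(schedule, x=0):
    """Execute a schedule of (transaction, action) steps on a single item x."""
    increments = {"i": 1, "j": 2}   # transaction i sets x:=x+1, j sets x:=x+2
    workspace = {}
    for txn, action in schedule:
        if action == "read":
            workspace[txn] = x
        else:  # "write": store the value computed from the earlier read
            x = workspace[txn] + increments[txn]
    return x


interleaved = [("i", "read"), ("j", "read"), ("j", "write"), ("i", "write")]
serial_ij = [("i", "read"), ("i", "write"), ("j", "read"), ("j", "write")]
serial_ji = [("j", "read"), ("j", "write"), ("i", "read"), ("i", "write")]

results = {
    "interleaved": run(interleaved),
    "i then j": run(serial_ij),
    "j then i": run(serial_ji),
}
```

Both serial orders leave x equal to 3, while the interleaved schedule leaves x equal to 1, so the interleaved schedule is equivalent to no serial ordering.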

2.8 Transaction Classes


The  problem  of  interleaved  transactions  is  not  unique  to 
distributed  systems.  Numerous  solutions  have  been  devised 
for  non-distributed  systems,  most  notably  locking 
mechanisms.  These  techniques  do  not,  however,  generalize 
well  to  distributed  systems.  A  number  of  proposals  have 
been  suggested  for  extending  locking  mechanisms  to 
distributed  systems  that  contain  redundant  data.  These 
techniques are reviewed in [ROTHNIE and GOODMAN]. We
feel,  however,  that  such  techniques  require  unacceptably 
large  amounts  of  network  transmission  and  delay  whenever 
there  is  considerable  data  redundancy. 


Yet at first glance the network transmission seems to be
necessary. How can one TM safely proceed to run a
transaction without first consulting other TM's to
determine that it does not interact badly with
transactions currently executing elsewhere?

Our solution to this problem is to have the database
administrator (DBA) establish a static set of transaction
classes. Each transaction class is defined in terms of its
logical read-set and write-set and is assigned to run at a
particular TM. A transaction can run in a class if the
read-set and write-set of the transaction are contained
(respectively) in the read-set and write-set of the class.
Classes need
not  be  disjoint,  so  a  transaction  may  fit  into  more  than 
one  class.  In  this  case,  the  decision  as  to  which  class 
should  be  chosen  is  made  by  the  terminal  module  that 
accepts  the  transaction.  The  terminal  module  will 
normally  choose  a  class  that  requires  the  least  amount  of 
synchronization,  and  is  therefore  the  least  expensive 
class  (synchronization-wise)  to  use. 
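
Class membership and class selection can be sketched as follows. The containment test follows the definition above; the numeric sync_cost field used to rank classes, the class and data item names, and the function names fits and choose_class are assumptions for illustration, since the way synchronization expense is measured has not been specified at this point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionClass:
    name: str
    read_set: frozenset
    write_set: frozenset
    sync_cost: int  # hypothetical ranking of synchronization expense


def fits(txn_reads, txn_writes, cls):
    """A transaction fits a class if its read-set and write-set are contained in it."""
    return set(txn_reads) <= cls.read_set and set(txn_writes) <= cls.write_set


def choose_class(txn_reads, txn_writes, classes):
    """Pick the fitting class with the least synchronization cost, if any."""
    candidates = [c for c in classes if fits(txn_reads, txn_writes, c)]
    return min(candidates, key=lambda c: c.sync_cost) if candidates else None


classes = [
    TransactionClass("boston_flights", frozenset({"BOS.seats"}),
                     frozenset({"BOS.seats"}), 0),
    TransactionClass("all_flights", frozenset({"BOS.seats", "NYC.seats"}),
                     frozenset({"BOS.seats", "NYC.seats"}), 5),
]
chosen = choose_class({"BOS.seats"}, {"BOS.seats"}, classes)
```

A transaction touching only BOS.seats fits both classes, and the cheaper class boston_flights is chosen; a transaction that fits no class yields None.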

The  predefined  classes  reflect  the  typical  transactions 
that  are  intended  to  run  at  each  site  in  the  network. 
Since  each  TM  is  aware  of  the  complete  set  of  transaction 
classes  assigned  to  foreign  transaction  modules,  it  can 
know  exactly  what  potential  conflicts  its  own  transactions 
have  with  those  that  might  be  running  at  other  TM's. 

From  the  information  contained  in  the  class  definitions,  a 
TM  can  determine  the  degree  and  nature  of  coordination 
necessary  to  ensure  a  serially  reproducible  ordering  of 
transactions.  We  believe  that,  for  many  kinds  of 
applications,  the  most  frequent  determination  will  be  that 
no  coordination  whatsoever  is  actually  required  to  run  a 
transaction. In such a case, the transaction is simply
executed immediately, since it does not interact badly
with transactions submitted elsewhere. In other cases, an
analysis  of  the  class  definitions  might  indicate  that  the 
pending  transaction  could  be  involved  in  a  potential 
conflict  and  some  coordination  is  necessary  with  respect 
to  particular  foreign  classes.  Our  purpose  here  is  to 
develop  a  method  of  determining  exactly  what  conflicts 
occur  and  to  provide  coordination  mechanisms  that 
eliminate  the  conflict. 

If  the  problem  of  determining  exactly  what  conflicts  might 
occur  required  run-time  calculations  when  each  transaction 
was  introduced  at  a  class,  then  the  concurrency  control 
mechanism  would  potentially  be  quite  expensive.  Actually, 
since  the  class  definitions  are  static,  the  computations 
checking  for  potential  conflicts  can  be  done  once,  when 
the  class  definitions  are  selected.  Selecting  the 
appropriate  coordination  mechanism  at  run-time  amounts  to 
a  table  look-up.  So,  the  only  significant  run-time 
overhead  is  the  coordination  mechanism  itself.  If  no 
coordination  is  found  to  be  necessary,  then  the  run-time 
overhead  is  negligible.  This  is  in  contrast  to  locking 
mechanisms, which always set locks, whether or not the
synchronization is really required.

2.10 Class Conflict Graphs


Given  the  set  of  class  definitions,  we  need  to  detect 
potentially  harmful  interactions  between  classes.  The 
approach  used  to  resolve  these  questions  involves  the 
construction  and  analysis  of  a  class  conflict  graph. 

A  class  definition  specifies  a  logical  read-set  and 
write-set  and  a  materialization.  This  is  the  only 
information  required  to  determine  class  conflicts.  From 
the  read-set  and  the  materialization,  the  READ  messages 
needed  by  the  class  can  be  predicted.  From  the  write-set, 
the  WRITE  messages  needed  by  the  class  can  be  predicted, 
since  a  WRITE  message  must  be  sent  to  all  copies  of  the 
logical  write-set.  Since  all  READ  and  WRITE  messages  are 
predictable,  we  will  be  able  to  predict  all  possible 
harmful  interactions  between  classes. 
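
As a rough illustration of how the messages follow from a class definition, the sketch below derives the DMs that would receive READ and WRITE messages. It assumes, hypothetically, that the materialization is a mapping from each logical item to the DM whose copy is read, and that a separate `copies` table lists every DM holding a copy of each item; neither representation is taken from the report.

```python
def read_targets(read_set, materialization):
    """DMs that receive a READ message: the DM chosen by the
    materialization for each item of the logical read-set."""
    return {materialization[item] for item in read_set}


def write_targets(write_set, copies):
    """DMs that receive a WRITE message: every DM holding a copy
    of some item in the logical write-set."""
    targets = set()
    for item in write_set:
        targets |= copies[item]
    return targets


# Hypothetical layout: x is stored at alpha and beta, y at beta and gamma.
copies = {"x": {"alpha", "beta"}, "y": {"beta", "gamma"}}
materialization = {"x": "alpha", "y": "beta"}

reads = read_targets({"x", "y"}, materialization)
writes = write_targets({"y"}, copies)
print(sorted(reads), sorted(writes))
```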

A  class  is  represented  in  the  class  conflict  graph  as 
three  types  of  nodes  connected  by  edges.  The  three  types 
of  nodes  are  e,  r  and  w  nodes. 

An  e  node  represents  the  execution  of  a  transaction  which 
runs in the class. A class superscript (e.g. e^ī)
designates the class identifier for the transaction class.
(Throughout  this  report,  transactions  will  be  indicated  by 
lower  case  letters,  and  transaction  classes  by  lower  case 
letters  with  an  overscore.)  The  graph  includes  exactly 
one  e  node  per  class. 


An r node represents the processing of a READ message to
retrieve data for transactions in the class. A
superscript represents the class identifier and a
subscript indicates to which DM the READ message would be
sent (e.g. r^j̄_alpha represents a READ message from a
transaction in class j̄ to DM_alpha). (Lower case Greek
letters denote DMs.) For any class, there is one r node
for each DM which stores part of the class's (physical)
read-set.


A w node represents the processing of a WRITE message
issued by a transaction running in the class. Again, a
superscript indicates the class identifier and a subscript
indicates the DM to which the WRITE message would be sent
(e.g. w^j̄_gamma). For any class, there is one w node for
each DM on which a copy of (some of) the write-set items
lies.


Edges  connect  the  e  node  for  a  particular  class  with  the  r 
and  w  nodes  for  that  class.  These  edges  are  called 
vertical edges, because of the convention that, for each
class, r nodes are drawn above the e node and w nodes are
drawn  below  the  e  node. 

Figure  2.2  illustrates  the  representation  of  a  class  whose 
read-set  lies  in  two  datamodules  and  whose  write-set  lies 
on  four  datamodules. 

After  all  the  predefined  transaction  classes  have  been 
placed  in  the  graph,  additional  edges  are  added  to 
indicate  interactions  between  the  classes. 


[Figure 2.2: diagram not reproduced. Its annotations mark
the two READ messages, each sent to a different DM, the
transaction class itself, and the four DMs to which data
must be written.]

Figure 2.2 Representing transaction classes in the graph




Where  two  classes  have  a  read/write  intersection,  a 
diagonal  edge  is  drawn.  The  edge  is  drawn  between  an  r 
node  which  represents  the  reading  of  some  particular 
physical  data  item  and  a  w  node  which  represents  the 
writing  of  that  same  item  (see  Figure  2.3).  Note  that 
such  a  diagonal  edge  only  connects  r  and  w  nodes  with  the 
same  DM  subscript,  since  a  physical  data  item  resides  at 
only  one  DM.  If  the  intersection  of  one  class's  read-set 
and  another's  write-set  spans  more  than  one  DM,  then 
several diagonal edges connect the two classes (see Figure
2.4).

Figure 2.5 - A horizontal edge is added to the graph
when two classes write the same data item.
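
The node and edge rules of this section can be sketched in code. This is a minimal illustration under stated assumptions: the class description format (physical read and write items per DM), the tuple encoding of nodes, and the choice to draw a horizontal edge between the two w nodes at the DM holding the commonly written item are all hypothetical.

```python
from itertools import combinations, permutations


def build_conflict_graph(classes):
    """classes: {name: {"reads": {dm: items}, "writes": {dm: items}}}.
    Returns a set of undirected edges, each a frozenset of two nodes."""
    edges = set()
    for name, c in classes.items():
        for dm in c["reads"]:      # vertical edge: e node to r node
            edges.add(frozenset({("e", name), ("r", name, dm)}))
        for dm in c["writes"]:     # vertical edge: e node to w node
            edges.add(frozenset({("e", name), ("w", name, dm)}))
    for a, b in permutations(classes, 2):
        for dm, items in classes[a]["reads"].items():
            # diagonal edge: a reads a physical item that b writes at dm
            if items & classes[b]["writes"].get(dm, set()):
                edges.add(frozenset({("r", a, dm), ("w", b, dm)}))
    for a, b in combinations(classes, 2):
        for dm, items in classes[a]["writes"].items():
            # horizontal edge: both classes write the same item at dm
            if items & classes[b]["writes"].get(dm, set()):
                edges.add(frozenset({("w", a, dm), ("w", b, dm)}))
    return edges


# Two hypothetical classes that both read and write x at DM alpha.
example = {
    "i": {"reads": {"alpha": {"x"}}, "writes": {"alpha": {"x"}}},
    "j": {"reads": {"alpha": {"x"}}, "writes": {"alpha": {"x"}}},
}
graph = build_conflict_graph(example)
print(len(graph))
```

For this example the graph has four vertical edges, two diagonal edges, and one horizontal edge.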


that  the  classes  can  interact  in  such  a  way  that  there  is 
no  serial  ordering  of  transactions  that  is  equivalent  to 
the interleaved execution that actually occurred. The
interpretation of the diagonal and horizontal edges
applied to a given interleaved execution is the key to
determining transaction serializability.

2.11 Graph Cycles and Nonserializability


Suppose  the  system  executes  in  a  manner  that  permits  the 
interleaving  of  READ  and  WRITE  messages  from  different 
transactions.  We  call  such  an  interleaved  execution  a 
log. If the execution is not interleaved, that is, if
transactions  execute  serially  one  after  the  other,  then  we 
call  the  execution  a  serial  log.  Our  goal  is  to  only 
permit  the  system  to  produce  logs  that  are  serially 
reproducible. This means that for each log resulting from
the  execution  of  the  system,  there  must  exist  a  serial  log 
that  produces  the  same  effect  on  the  database.  We  say 
that  two  logs  are  equivalent  if  they  produce  the  same 
effect  on  the  database. 

Of  course  if  the  transactions  in  a  log  are  arbitrarily 
reordered  into  a  serial  log,  the  resulting  serial  log  will 
not  necessarily  be  equivalent  to  the  given  log.  The 
conflict  graph  helps  us  to  characterize  precisely  those 
serial  logs  that  produce  the  same  effect  as  a  given  log. 

Consider diagonal graph edges. A diagonal edge represents
a read/write intersection between two classes. If one
transaction from each of the two classes appears in the
given log, then in any equivalent serial log the
transactions should appear in the same relative order as
their intersecting READ and WRITE messages were processed
in the given log. For if the READ message of one
transaction preceded the WRITE message of the other in the
given log, but the transactions appear in the reverse
order in the serial log, then the READ message may read
different values for some of its inputs in the serial log
than it read in the given log. So, the transaction
corresponding to the READ may produce a different output
in the serial log than in the given log. That is, the two
logs are not necessarily equivalent. This is just to say
that only some serial reorderings of the given log are
possible, given the existence of this diagonal edge.
(Actually, the above claim about permissible serial
reorderings is somewhat too strong, as shown in
[PAPADIMITRIOU et al]. However, the reasons are quite
technical in nature and are not needed to gain an
understanding of the interpretation of conflict graphs.)

Consider classes ī and j̄ in figure 2.3. We denote READ
and WRITE messages using a notation similar to that of
node labels. The processing of the READ message for
transaction i at DM_alpha is denoted R^i_alpha; the
processing of the WRITE message for transaction i at
DM_alpha is denoted W^i_alpha.

Assume two transactions, say i and j, are running
concurrently in classes ī and j̄ respectively. If the READ
message R^i_alpha is processed at DM_alpha before the WRITE
message W^j_alpha is processed, then any equivalent serial
ordering must have transaction i precede transaction j.
This must be so, for otherwise transaction i would have
read the results of the update made by transaction j. On
the other hand, if the WRITE message W^j_alpha is processed
before the READ message R^i_alpha, then transaction j must
precede transaction i.


To  reiterate,  a  diagonal  edge  implies  a  particular 
relative  ordering  in  any  serial  log  that  is  equivalent  to 
the  given  interleaved  execution.  The  particular  ordering 
that  is  chosen  depends  on  the  particular  order  in  which 
READ and WRITE messages were processed; however, the
relative serial ordering of transactions from classes with
a diagonal edge connecting them is not arbitrary.


Horizontal  edges  also  affect  possible  reorderings  of 
transactions.  A  horizontal  edge  indicates  an  intersection 
of  write-sets.  Whenever  two  transactions  write  the  same 
data,  the  update  from  the  transaction  with  the  greater 
(i.e.  later)  timestamp  takes  precedence  over  the  update 
from  the  transaction  with  the  smaller  (i.e.  earlier) 
timestamp. If two transactions in different classes
appear in an interleaved execution and have a write/write
intersection,  then  they  must  appear  in  timestamp  order  in 
any  equivalent  serial  log.  Otherwise,  the  effect  of  the 
intersecting  write  messages  would  be  reversed,  thereby 
producing  a  different  database  state.  Notice  that  it  is 
the  timestamp  order  of  the  transactions  and  not  the  order 
in  which  the  WRITE  messages  were  processed  that  is 
significant  here.  This  is  because  the  rule  by  which  WRITE 
messages  are  processed  uses  the  timestamps,  not  the  order 
of  arrival  of  the  WRITE  messages,  to  determine  which  write 
operations  are  actually  applied. 
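
The timestamp rule for writes can be illustrated with a small sketch. Storing one timestamp alongside each data item is an assumption made here for illustration, and the `DataItem` class is hypothetical.

```python
class DataItem:
    """A stored value tagged with the timestamp of the write that produced it."""

    def __init__(self, value, timestamp=0):
        self.value = value
        self.timestamp = timestamp

    def apply_write(self, value, timestamp):
        """Apply the write only if its timestamp is later than the stored one."""
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp
            return True
        return False


x = DataItem(0)
x.apply_write("from_j", timestamp=20)  # the later transaction arrives first
x.apply_write("from_i", timestamp=10)  # the earlier one arrives second and is ignored
print(x.value)
```

Whatever the arrival order, the stored value is the one written by the transaction with the greater timestamp, which is why the timestamp order fixes the relative serial order of the two transactions.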

So,  a  horizontal  edge  also  implies  a  particular  relative 
ordering  of  certain  transactions  in  any  serial  log  that  is 
equivalent  to  the  given  interleaved  execution.  This 
ordering  is  always  the  timestamp  ordering  of  the 
transactions  that  have  the  write/write  intersection. 

In  the  same  way  that  diagonal  and  horizontal  edges 
restrict  the  ways  in  which  transactions  can  be  reordered 
without  upsetting  the  resulting  database  state,  paths  of 
edges  can  restrict  reorderings  of  transactions  as  well. 
For  example,  a  particular  diagonal  edge  may  imply  that 
transaction  i  must  precede  transaction  j  and  an  adjacent 
horizontal  edge  may  indicate  that  transaction  j  must 
precede transaction k (see figure 2.6). So, the net
effect of this path is that transaction i must precede
transaction k, even though no single edge may connect
their respective classes in the conflict graph.


Now,  suppose  again  that  we  have  a  conflict  graph  and  a  log 
of  interleaved  transactions.  Suppose  that  for  each  pair 
of  transactions,  say  i  and  j,  the  log  and  graph  edges 
never  imply  both  that  i  must  precede  j  and  that  j  must 
precede  i  in  the  serial  reordering.  That  is,  either  i  and 
j  can  appear  in  an  arbitrary  order,  or  there  is  only  one 
order  that  will  do.  Then  it  is  easy  to  see  that  there 
must be a serial log equivalent to the given log. Any
serialization that preserves the relative orderings that
are  demanded  by  the  graph  serves  the  purpose. 
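
Finding such a serialization amounts to a topological sort of the precedence constraints. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the function name and the pair encoding of constraints are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def equivalent_serial_order(transactions, must_precede):
    """must_precede: iterable of (a, b) pairs meaning a must come before b.
    Returns one serial order satisfying every constraint, or None if the
    constraints contradict each other."""
    sorter = TopologicalSorter({t: set() for t in transactions})
    for before, after in must_precede:
        sorter.add(after, before)  # 'after' depends on 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# A diagonal edge forces i before j; a horizontal edge forces j before k.
order = equivalent_serial_order(["i", "j", "k"], [("i", "j"), ("j", "k")])
print(order)

# Contradictory requirements: no equivalent serial log exists.
print(equivalent_serial_order(["i", "j"], [("i", "j"), ("j", "i")]))
```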

However,  suppose  instead  that  there  are  two  transactions 
such  that  one  path  in  the  graph  requires  that  they  appear 
in  one  order  and  another  path  in  the  graph  requires  that 
they  appear  in  the  other  order.  Then  there  is  no 
equivalent serial log that includes these two
transactions, for in whatever order they appear in the
serial log, the graph indicates that they must also
appear in the other order. In this case, there are two
different paths connecting the two transactions' classes
in the graph. These two paths constitute a cycle in the
graph. So, apparently a cycle in the graph corresponds to
a non-serializable execution of transactions. If there
are  no  cycles,  then  there  is  at  most  one  path  connecting 
any  pair  of  classes.  Hence,  the  graph  can  only  require 
that  two  transactions  be  serialized  one  way  or  the  other, 
but  never  both  ways.  So,  a  cycle-free  graph  implies  that 
every  log  is  serializable,  and  no  synchronization 
whatsoever  is  required.  The  preceding  informal  argument 
demonstrating  this  fact  will  be  proved  quite  rigorously  in 
Section  4. 
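
Since the conflict graph is undirected, checking it for cycles is straightforward. The following sketch uses a union-find structure; the node names and the example graph (two classes joined by two diagonal edges) are hypothetical.

```python
def has_cycle(edges):
    """edges: iterable of (u, v) undirected edges, no duplicates.
    Returns True if the graph contains a cycle."""
    parent = {}

    def find(n):
        parent.setdefault(n, n)
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:          # u and v were already connected by another path
            return True
        parent[ru] = rv
    return False


edges = [
    ("e_i", "r_i_beta"), ("e_i", "w_i_gamma"),    # vertical edges of class i
    ("e_j", "r_j_gamma"), ("e_j", "w_j_beta"),    # vertical edges of class j
    ("r_i_beta", "w_j_beta"),                     # diagonal edge at DM beta
    ("r_j_gamma", "w_i_gamma"),                   # diagonal edge at DM gamma
]
print(has_cycle(edges))       # both diagonal edges present
print(has_cycle(edges[:-1]))  # one diagonal edge removed
```

With both diagonal edges the two classes are connected by two different paths, so a cycle is reported; dropping either diagonal edge leaves a cycle-free graph.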


Consider the cycle in Figure 2.7 consisting of two
diagonal edges and four vertical edges. If we examine a
case of concurrent transactions in each of the two classes
and the particular sequence of events in which the READ
message R^i_beta is processed before the WRITE message
W^j_beta, and the READ message R^j_gamma is processed
before the WRITE message W^i_gamma, then there is no serial
ordering of the two transactions which is equivalent to
their interleaved ordering. This follows because the
r^ī_beta - w^j̄_beta edge requires that the transaction in
class ī occurs before the transaction in class j̄, yet the
r^j̄_gamma - w^ī_gamma edge implies the opposite relative
ordering. Therefore, it must be the case that no
equivalent serial ordering exists.

Figure 2.7 - Cycles represent situations in which
non-serializability is possible.

We have shown that potentially dangerous interleavings can
be identified by a cycle in the class conflict graph. So,
as long as no cycles exist, the class pipelining rule is
sufficient to guarantee serializability. Where cycles do
exist, some synchronization among classes is required. In
SDD-1, this synchronization is accomplished by protocols.

2.12 Protocol P3


When a cycle exists in the conflict graph, then an
interleaved execution might be such that a pair of
transactions, i and j, must be serialized with i preceding
j and j preceding i, clearly an impossibility. Protocol
P3 prevents this situation by making the following
guarantee: If two transactions belong to two classes
connected by a diagonal edge in a cycle, then the
timestamp order of the two transactions is the same as the
relative ordering dictated by the diagonal edge. For
example, suppose the edge (r^ī_alpha, w^j̄_alpha) lies on a
cycle and transaction i executes in class ī and j executes
in class j̄. Then, assuming protocol P3 is observed,
R^i_alpha is processed before W^j_alpha if and only if the
timestamp of i is smaller than the timestamp of j. Before
describing how P3 accomplishes this task, let us first
examine how P3 prevents nonserializable executions.


Consider again transactions i and j above. Since they
apparently must be serialized in both orders, there must
be two independent paths connecting them in the graph,
such that one path requires that i precede j and the other
requires that j precede i. Suppose the timestamp of i is
smaller than that of j. So, the path that requires j to
precede i in the serial reordering is trying to serialize
them in reverse timestamp order. But suppose every
transaction pair connected by a diagonal edge in this path
observes P3. Then each such pair must be serialized in
timestamp order, as P3 requires. Consider a pair of
transactions connected on the path by a horizontal edge.
Following the discussion about horizontal edges in the
last section, they too must be serialized in timestamp
order. Thus, every pair of transactions in the
interleaved execution that corresponds to a graph edge
along this path must be serialized in timestamp order.
The net effect (by induction on the length of the path) is
that the entire path requires that i and j be serialized
in timestamp order. But this is a contradiction, since
the chosen path was one that required the transactions to
be serialized in reverse timestamp order. The conclusion
is that all paths in the graph between ī and j̄ require
that i and j be serialized in timestamp order. Protocol
P3 prevents the case that there are two independent paths
between ī and j̄ that require opposite relative orderings.

To implement protocol P3, we need to synchronize the READ
and WRITE messages of transactions that correspond to the
endpoints of a diagonal edge in a cycle. To explain the
operation of P3, suppose that the edge (r^ī_alpha,
w^j̄_alpha) is a diagonal edge in a cycle; so, for each
transaction i in class ī, R^i_alpha has to run against
class j̄ at DM_alpha. This is accomplished by appending a
read condition to each read message R^i_alpha. The read
condition includes the timestamp of transaction i, say
TS_i, and the name of the class against which P3 is being
run, in this case j̄. A data module, upon encountering a
READ message with the attached read condition <TS_i, j̄>,
must not process the READ until it is certain that all
WRITE messages from j̄ with timestamps prior to TS_i have
been received and processed, and that it has not processed
any WRITE messages from j̄ with a timestamp greater than
TS_i. This ensures that the READ message R^i_alpha is
processed before a WRITE message from j̄ if and only if
TS_i is smaller than the timestamp of the transaction
corresponding to the WRITE message. That is, it
guarantees that the diagonal edge forces transactions from
the two classes to be serialized in timestamp order. We
refer to this mechanism as protocol P3, and would say, for
example, that transactions in class ī run protocol P3
against transactions in class j̄ at DM_alpha.


Several problems arise in the operation of protocol P3.
Suppose the DM has already processed a WRITE message from
the specified class j̄ with a timestamp greater than TS_i.
In this case, the READ message must be rejected by
DM_alpha. The initiating TM then assigns a new timestamp
to the transaction and resubmits its READ requests.
Notice that all READ messages must be resubmitted if any
READ message is rejected.


A more serious problem is how to guarantee that a DM has
received all WRITE messages through some particular time.
The solution lies in the class pipelining rule. Recall
that READ and WRITE messages from a class to a DM must be
processed in timestamp order. If DM_alpha wants to process
all WRITE messages from j̄ up to but not past time TS_i, it
simply processes all WRITE messages from j̄ until it
receives one with a timestamp greater than TS_i. It holds
this WRITE message until R^i_alpha is processed, thereby
satisfying the read condition attached to R^i_alpha.
Unfortunately, if class j̄ is idle because it has no
transactions to process, DM_alpha may need to wait for a
long time until a message timestamped later than TS_i
arrives from j̄. To handle this problem we have TM's send
out NULLWRITE messages to appropriate DM's. A NULLWRITE
message specifies a class and a timestamp. It is
semantically equivalent to a WRITE message that does not
update any data. When a DM receives such a NULLWRITE
message, it can be sure that it has received all WRITE
messages from the indicated class through the given
timestamp.


TM's will send out NULLWRITEs on a periodic basis. In
addition,  a  TM  may  be  specifically  requested  to  send  a 
NULLWRITE  for  a  particular  class  and  timestamp.  This 
specific  request  is  in  the  form  of  a  SENDNULL  message  and 
may  be  sent  by  either  another  TM  or  a  DM.  A  discussion 
and  analysis  of  various  strategies  for  sending  NULLWRITE 
and  SENDNULL  messages  will  appear  in  a  later  report. 
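
The behavior of a DM under protocol P3 can be sketched as a small model. Everything here is an assumption made for illustration: the `DataModule` class, its method names, the integer timestamps, the ("ok" / "wait" / "reject") results, and the simplification that a waiting READ is simply retried by the caller. Messages from one class are assumed to arrive in timestamp order, as the class pipelining rule requires.

```python
class DataModule:
    """Hypothetical model of one DM enforcing P3 read conditions."""

    def __init__(self, initial):
        self.data = {item: (value, 0) for item, value in initial.items()}
        self.processed = {}  # class -> largest WRITE timestamp processed
        self.received = {}   # class -> largest timestamp received so far
        self.waiting = {}    # (reader class, read ts) -> class it runs P3 against
        self.held = []       # WRITEs held back behind a waiting READ

    def write(self, cls, ts, item=None, value=None):
        """Receive a WRITE from cls; item=None models a NULLWRITE."""
        self.received[cls] = max(ts, self.received.get(cls, 0))
        if item is None:
            return
        blocked = any(against == cls and ts > read_ts
                      for (_, read_ts), against in self.waiting.items())
        if blocked:          # hold it until the waiting READ is processed
            self.held.append((cls, ts, item, value))
            return
        self.processed[cls] = max(ts, self.processed.get(cls, 0))
        if ts > self.data[item][1]:   # the later timestamp wins
            self.data[item] = (value, ts)

    def read(self, cls, ts, item, against):
        """READ with read condition <ts, against>."""
        self.received[cls] = max(ts, self.received.get(cls, 0))
        if self.processed.get(against, 0) > ts:
            self.waiting.pop((cls, ts), None)
            return ("reject", None)   # a later WRITE was already processed
        if self.received.get(against, 0) <= ts:
            self.waiting[(cls, ts)] = against
            return ("wait", None)     # not yet sure all earlier WRITEs arrived
        self.waiting.pop((cls, ts), None)
        value = self.data[item][0]
        held, self.held = self.held, []
        for w in held:                # release WRITEs held behind this READ
            self.write(*w)
        return ("ok", value)


dm = DataModule({"x": 0})
first = dm.read("i", 20, "x", against="j")   # no message from j past 20 yet
dm.write("j", 30)                            # NULLWRITE from class j
second = dm.read("i", 20, "x", against="j")  # read condition now satisfied
print(first, second)
```

In this run the first attempt must wait, and the NULLWRITE with timestamp 30 lets the retry proceed and return the initial value.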

To  illustrate  the  use  of  protocol  P3  for  eliminating  bad 
interleaved  executions,  let  us  reconsider  the  anomalous 
scenario  discussed  in  section  2.7,  this  time  adding  a  bit 
more  structure  to  the  problem. 


We assume a single copy of data item x, residing at
DM_alpha, with value x = 0. Class ī has been defined
to run at TM_alpha with read-set = {x} and write-set = {x}.
Class j̄ has been defined to run at TM_beta with read-set =
{x} and write-set = {x}. The class graph in this
situation is shown in figure 2.8. Notice that a cycle is
present and that transactions in class ī must run P3
against class j̄ and that transactions in class j̄ must run
P3 against class ī.

1. TM_alpha sends a READ message, R^i_alpha, to DM_alpha
to retrieve the value of data item x for transaction
i. This READ includes a P3 read condition against
class j̄. The READ message cannot be immediately
processed because WRITE messages through time TS_i
from class j̄ have not yet been received at DM_alpha.

2. TM_beta sends a READ message, R^j_alpha, to DM_alpha to
retrieve the value of data item x for transaction j.
The READ message can be immediately processed (the
presence of a class ī READ message at DM_alpha with
timestamp TS_i > TS_j ensures that all WRITE messages
from class ī have been received through time TS_j).
The result of the READ is x=0.

3. TM_beta sends a WRITE message for transaction j to
DM_alpha setting x:=2.

4. TM_beta sends a NULLWRITE message to DM_alpha with
timestamp TS_j' > TS_i. (This message may be a
response to a SENDNULL request from TM_alpha. The
class pipelining rule requires that this message
could not be sent before the WRITE message with time
TS_j < TS_j'.) The READ message for transaction i can
now be processed. (The presence of the NULLWRITE
message at DM_alpha with timestamp TS_j' > TS_i
satisfies the P3 read condition.) The result of the
READ is x=2.

5. TM_alpha sends a WRITE message for transaction i to
DM_alpha setting x:=3. Notice that this WRITE message
overwrites the earlier value of x = 2 because the
earlier value was associated with timestamp TS_j and
the current WRITE message has timestamp TS_i > TS_j.

The final value of data item x is 3, as expected. The
anomalous interleaving that was described in the example
of section 2.7 has been prevented by the use of protocol
P3.


We have seen that by locating graph cycles, by finding
every class that lies at the r-end of a diagonal edge
embedded in a cycle, and by having transactions in that
class run protocol P3, we can guarantee that all
interleaved executions will be serializable. However,
there are situations in which weaker protocols (i.e.,
protocols that allow more concurrency) than P3 may be
used. This leads us to a discussion of protocols P2 and
P2f.

2.13 Protocol P2


The main opportunity for weakening the P3 protocol arises
in connection with the transactions that participate in a
conflict graph only with their read-nodes. These
read-only transactions contribute to non-serializability
only because they may observe certain WRITE messages being
processed in reverse timestamp order. For example,
suppose we have classes ī, j̄, and k̄ connected by the edges
(w^ī_alpha, r^j̄_alpha) and (r^j̄_alpha, w^k̄_alpha) as shown
in figure 2.9. Class j̄ is a read-only class whose read-set
intersects the write-sets of classes ī and k̄. Suppose
transactions i, j, and k execute in classes ī, j̄, and k̄
(respectively) such that k is timestamped before i which
is timestamped before j. At DM_alpha, the following
sequence of events might occur: first W^i_alpha is
processed, then R^j_alpha is processed, then W^k_alpha is
processed. In this case, even though k is timestamped
earlier than i, from j's point of view transaction i
precedes transaction k, since it sees i's update but has
not yet seen k's update. That is, this interleaved
execution requires that transaction i be serialized in
front of transaction k, which is the reverse timestamp

situation that has the same effect


The effect we want to produce is that if W^i_alpha is
timestamped after W^k_alpha and W^i_alpha is processed
before R^j_alpha, then W^k_alpha is processed before
R^j_alpha as well. If this condition is made to be true
(by some protocol) then R^j_alpha cannot observe W^i_alpha
and W^k_alpha to execute in reverse timestamp order. The
protocol that has this effect is called P2.


Protocol P2 applies to a read message R^j_alpha if and only
if there are classes ī and k̄ such that (w^ī_alpha,
r^j̄_alpha, w^k̄_alpha) is a subpath in a cycle in the
conflict graph (where j runs in class j̄). In this case,
we say that R^j_alpha must run protocol P2 against classes
ī and k̄ at DM_alpha. If protocol P2 is used, then
R^j_alpha need not run protocol P3 against ī and k̄, as
would normally be indicated by the diagonal edges. Since
P2 prevents R^j_alpha from observing transactions in ī and
k̄ in reverse timestamp order, R^j_alpha will not interfere
with serializing transactions in ī and k̄ in timestamp
order, as desired.


To run R^j_alpha under P2 against ī and k̄, DM_alpha must
ensure that, at the time R^j_alpha is processed, there is a
timestamp TS_0, such that all WRITE messages from classes ī
and k̄ whose timestamps are less than TS_0 have been
processed at DM_alpha, and no WRITE messages from classes ī
and k̄ whose timestamps are greater than TS_0 have been
processed. The specific timestamp, TS_0, is not given by
the READ message R^j_alpha but rather is selected by
DM_alpha. As long as there exists some TS_0 through which
WRITEs from the classes ī and k̄ have been processed but
beyond which they have not been processed, then R^j_alpha
will only be able to observe transactions in classes ī and
k̄ to have been run in their relative timestamp order.


The  implementation  of  protocol  P2  requires  an  extension  to 
the  read  condition  mechanism.  Since  the  DM  is  expected  to 
choose a convenient TS_0 (cf. P3 where the timestamp is
prespecified  in  the  READ  message),  the  timestamping  in  the 
read  condition  cannot  be  determined  until  the  READ  message 
is  processed.  So,  a  named  timestamp  marker  may  be 
supplied  in  place  of  a  particular  timestamp  in  the  read 
condition.  Whenever  a  DM  encounters  a  timestamp  marker  in 
a  read  condition,  it  may  choose  an  appropriate  time 
itself,  with  the  proviso  that  when  two  or  more  read 
conditions  are  given  for  a  single  READ  message,  all 
timestamp  markers  with  the  same  name  must  be  assigned  the 
same  timestamp  value. 
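
One way a DM might pick a value for a timestamp marker is sketched below. The bookkeeping is an assumption, not something the report specifies: for each class the DM is taken to know `processed_max`, the largest WRITE timestamp it has processed, and `complete_through`, a timestamp below which every WRITE from that class is known to have been processed (for example, the timestamp of the latest WRITE or NULLWRITE received from the class under the pipelining rule).

```python
def choose_marker(state, classes):
    """state: {cls: (processed_max, complete_through)} (hypothetical bookkeeping).
    Returns a timestamp TS_0 such that, for every listed class, all WRITEs
    with smaller timestamps are processed and no WRITE with a larger
    timestamp is processed; returns None if no such timestamp exists yet."""
    low = max(state[c][0] for c in classes)    # TS_0 must not be below any processed WRITE
    high = min(state[c][1] for c in classes)   # TS_0 must not pass any class's known horizon
    return low if low <= high else None


# Class i has processed a WRITE at time 20; class k is only known through time 5,
# so an earlier WRITE from k could still arrive: the READ must wait.
print(choose_marker({"i": (20, 20), "k": (0, 5)}, ["i", "k"]))

# After k's WRITE at time 10 is processed and a NULLWRITE at time 25 arrives,
# the single marker value 20 satisfies both read conditions.
print(choose_marker({"i": (20, 20), "k": (10, 25)}, ["i", "k"]))
```

Using one computed value for all classes reflects the proviso that timestamp markers with the same name receive the same timestamp.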

For R^j_alpha to run P2 against classes ī and k̄,
R^j_alpha's READ message must include two read conditions,
<TSM, ī> and <TSM, k̄>, where TSM is a timestamp marker. By
satisfying the read conditions, DM_alpha fulfills the
protocol P2 condition against classes ī and k̄, as desired.

It is interesting to note that protocol P2 is strictly
weaker than P3 in the following sense. If R^j_alpha runs P3
against classes ī and k̄ at DM_alpha, then R^j_alpha
satisfies the P2 constraint against ī and k̄ as well. The
converse is not true. Since P2 always permits more
concurrency than P3, it is always advantageous to run P2
in place of P3 wherever possible.


An example will illustrate the use of protocol P2. Suppose
there are two data items of interest, x and y, which reside
at both DM_alpha and DM_beta; initially x=0 and y=0. We
assume there is an integrity constraint requiring that
y ≤ x^2. Three classes have been defined. Class I runs at
TM_alpha, reads x from DM_alpha, and writes x. Class J runs
at TM_alpha, reads x from DM_alpha, and writes y. Class K
runs at TM_beta, and reads x and y from DM_beta. A class
conflict graph for this configuration is shown in Figure
2.10. Notice that a cycle is present and that transactions
in class J must run P3 against transactions in class I at
DM_alpha, and transactions in class K must run protocol P2
against classes I and J.
[Figure 2.10 - Class Conflict Graph for Example in Section
2.13 (transaction module, read-set, and write-set of each
class, followed by the graph)]


Transaction i is received at TM_alpha, requests to perform
the computation x := x+1, is assigned to class I, and is
given timestamp TS_i. Transaction j is received at
TM_alpha, requests to perform y := x^2, is assigned to
class J, and is given timestamp TS_j > TS_i. Transaction k
is received at TM_beta, requests to print the values of x
and y on the user's terminal, is assigned to class K, and
is given timestamp TS_k > TS_j. Notice that each of these
transactions preserves the constraint that y ≤ x^2. No
serial ordering of the transactions could invalidate this
condition.

First,  we  consider  an  anomalous  scenario  in  which 
transaction  k  does  not  run  protocol  P2  as  is  required: 

1. TM_alpha sends a READ message to DM_alpha for
transaction i and retrieves x=0.

2. TM_alpha sends WRITE messages to DM_alpha and DM_beta
for transaction i. Each WRITE message contains
timestamp TS_i and the assignment x := 1.

3. DM_alpha processes the WRITE for transaction i (but
DM_beta has not yet done so).

4. TM_alpha sends a NULLWRITE message for class I with
timestamp TS_i' > TS_j to DM_alpha.

5. TM_alpha sends a READ message to DM_alpha for
transaction j and retrieves x=1. (The P3 read
condition on this READ message is immediately
satisfied because of the previously received
NULLWRITE message.)

6. TM_alpha sends WRITE messages to DM_alpha and DM_beta
for transaction j. Each WRITE message contains
timestamp TS_j and the assignment y := 1.

7. DM_alpha processes j's WRITE message.

8. DM_beta processes j's WRITE message.

9. TM_beta sends a READ message to DM_beta for
transaction k, retrieving x=0, y=1.

10. Transaction k prints x=0, y=1 on the user's
terminal.

11. DM_beta processes the WRITE message from i,
thereby setting x=1.

The  user  has  seen  an  impossible  state  of  the  database 
(i.e., x=0, y=1) printed by transaction k, with y > x^2.
The  problem  is  that  k  is  reading  both  the  input  and  output 
of  another  transaction,  j.  However,  k  is  reading  the  new 
value  of  the  output  but  an  old  value  of  the  input  on  which 
that  output  is  based. 

If  k  had  run  protocol  P2  as  required,  then  this  situation 
could not have occurred. By replacing steps (9)-(11) with
the  following,  we  obtain  a  correct  scenario  in  which  k 
satisfies  P2. 

9. TM_beta sends a READ message to DM_beta for
transaction k. The P2 read condition requires that
WRITE's from classes I and J be processed through
some common time. Now J has been processed through
time TS_j, but class I has not been processed through
that time yet.

10. DM_beta processes the WRITE message for i.

11. A NULLWRITE message arrives at DM_beta for class I
with timestamp TS_i' > TS_j.

12. DM_beta can now process the READ message from k,
since WRITE's from both I and J have been processed
through time TS_j. It retrieves x=1, y=1.

13. Transaction k prints x=1, y=1 at the user's
terminal.


Notice that it was not necessary for transaction k to use
protocol P3 to obtain a correct result. It only had to
wait until WRITE's from classes I and J had been processed
through time TS_j, not through time TS_k (its own
timestamp).
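The waiting behavior at DM_beta in steps (8)-(12) can be
replayed in a few lines of Python. This is only a sketch
under stated assumptions: the DataModule class, its method
names, and the integer timestamps are invented for
illustration, and a timestamp marker is taken to be
satisfiable when some TS_0 lies at or after the latest
WRITE processed from every named class and at or before
the time through which each class is known to have no
outstanding WRITE's.

```python
class DataModule:
    """Toy data module tracking how far each class's WRITEs are processed."""

    def __init__(self, data, classes):
        self.data = dict(data)
        # Timestamp of the latest WRITE processed from each class.
        self.last_write = {c: 0 for c in classes}
        # No unprocessed WRITE from the class is timestamped at or before this.
        self.known_through = {c: 0 for c in classes}

    def write(self, cls, ts, updates):
        # The data item timestamping rule is omitted here because each
        # item in this example is written only once.
        self.data.update(updates)
        self.last_write[cls] = ts
        self.known_through[cls] = max(self.known_through[cls], ts)

    def nullwrite(self, cls, ts):
        self.known_through[cls] = max(self.known_through[cls], ts)

    def marker_time(self, classes):
        """Return a common TS_0 for the classes, or None if the READ must wait."""
        lo = max(self.last_write[c] for c in classes)
        hi = min(self.known_through[c] for c in classes)
        return lo if lo <= hi else None

    def read(self, items):
        return {name: self.data[name] for name in items}


TS_I, TS_J, TS_I2 = 1, 2, 3                 # TS_i < TS_j < TS_i'
dm_beta = DataModule({"x": 0, "y": 0}, classes=["I", "J"])

dm_beta.write("J", TS_J, {"y": 1})          # step 8
unsafe = dm_beta.read(["x", "y"])           # what k sees if it skips P2
before = dm_beta.marker_time(["I", "J"])    # step 9: READ must wait
dm_beta.write("I", TS_I, {"x": 1})          # step 10
middle = dm_beta.marker_time(["I", "J"])    # class I still behind TS_j
dm_beta.nullwrite("I", TS_I2)               # step 11
ts0 = dm_beta.marker_time(["I", "J"])       # step 12: TS_0 = TS_j
safe = dm_beta.read(["x", "y"])
```

In this sketch the marker resolves to TS_j rather than
TS_k, which is the point made in the text: k waits only
for a common time, not for its own timestamp.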
2.14 Protocol P2f


Protocol P2f is quite similar to protocol P2. It is used
in cycles that contain a w-r-e-r-w subpath such as the
subpath (w^I_alpha, r^J_alpha, e^J, r^J_beta, w^K_beta)
shown in Figure 2.11. The "f" in P2f refers to the fact
that reading is being done from a foreign DM. As in a P2
subpath, a transaction in class J is able to observe an
ordering of transactions in classes I and K; protocol P2f
is designed to ensure that the observed ordering is always
the timestamp ordering of the transactions. If the above
subpath is part of a cycle, then each transaction, j, in
class J must run P2f against I at DM_alpha and K at
DM_beta. This means that there must be a timestamp, say
TS_0, such that all WRITE messages from I timestamped
before TS_0 and none timestamped after TS_0 are processed
before R^j_alpha at DM_alpha, and all WRITE messages from K
timestamped before TS_0 and none timestamped after TS_0
are processed before R^j_beta at DM_beta. Protocol P2f
essentially runs half of P2 (against I) at one DM and half
of P2 (against K) at another DM.




Since reading is being done from two separate DM's, it is
not possible to use the timestamp marker mechanism. (If
timestamp markers were used, it would be necessary for the
two DM's involved to carry on a conversation to determine
a mutually satisfactory timestamp to substitute for the
marker. This kind of synchronization overhead is exactly
what we are trying to avoid.) Instead, the TM issuing the
READ messages chooses a timestamp (i.e., TS_0 above) and
includes a read condition on each READ with this
timestamp. That is, if J must run P2f against I at
DM_alpha and K at DM_beta, then a transaction j in class J
includes the read condition <TS_0, I> in R^j_alpha and
<TS_0, K> in R^j_beta for some chosen value of TS_0.
Unfortunately, choosing a TS_0 for P2f is not quite as
nice as using timestamp markers in P2, because the P2f
READ messages have a greater likelihood of being rejected
or having to wait. The primary difference between read
conditions issued as part of protocol P3 and those issued
as part of protocol P2f is that the read condition
timestamp for protocol P3 must be the same as the
timestamp of the issuing transaction, while the read
condition timestamp for protocol P2f may have any value.
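A small sketch may help show why a TM-chosen TS_0 can be
rejected or delayed. The function name, its arguments, and
the three outcome strings are assumptions made for this
illustration; the bookkeeping (latest processed WRITE and
"known through" time per class) is a simplification, not
the SDD-1 implementation.

```python
def check_read_condition(ts0, last_write, known_through):
    """Evaluate one read condition <TS_0, class> at a data module.

    last_write:    timestamp of the latest WRITE already processed
                   from the class named in the condition.
    known_through: time through which the class is known to have no
                   outstanding WRITEs (advanced by WRITE and NULLWRITE).
    """
    if last_write > ts0:
        # A WRITE timestamped after TS_0 has already been processed.
        return "reject"
    if known_through < ts0:
        # WRITEs timestamped up to TS_0 may still be on their way.
        return "wait"
    return "accept"


outcomes = [
    check_read_condition(5, last_write=7, known_through=7),
    check_read_condition(5, last_write=3, known_through=4),
    check_read_condition(5, last_write=3, known_through=6),
]
```

With a timestamp marker the DM could have picked any value
in the acceptable range itself; with a fixed TS_0 the first
two cases arise whenever the TM's guess falls outside it.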
2.15 Protocol P1

If a transaction class appears in the graph but does not
run one of protocols P2, P2f, or P3, then we say it runs
protocol P1. That is to say, protocol P1 is the protocol
that involves no synchronization other than the data item
timestamping rule and the class pipelining rule.

P1, P2, P2f, and P3 provide a graduated set of mechanisms
in terms of concurrency and synchronization expense. A
goal in designing a particular application is to
distribute the data and define the classes to use the
lower numbered protocols most frequently.
Graph Topology            Protocol Requirement
(the subpath shown
is part of a cycle)

[subpath diagram]         Transactions in class i must
                          run protocol P2 with respect
                          to classes j and k.

[subpath diagram]         Transactions in class i must
                          run protocol P2f with respect
                          to classes j and k.

[subpath diagram]         Transactions in class i must
                          run protocol P3 with respect
                          to class k.

[subpath diagram]         Transactions in class i must
                          run protocol P3 with respect
                          to class j.

Figure 2.12 - Protocol requirements are suggested by the
graph topology
2.16 Pre-Analysis of the Class Conflict Graph


Figure  2.12  summarizes  the  results  so  far,  illustrating 
how  particular  graph  topologies  indicate  that  particular 
protocols  must  be  run. 

If  it  were  necessary  to  compute  graph  edges  and  cycles 
before  executing  each  transaction,  the  cost  of  doing  so 
would  clearly  be  prohibitive.  Fortunately,  this  is  not 
necessary.  The  class  definitions  are  specified  by  a  DBA 
at  application  design  time  and  at  that  time  the  class 
conflict  graph  can  be  computed  and  analyzed.  The  result 
of such an analysis will be a list of read conditions for
each  class.  Note  that  a  class  may  have  more  than  one  or 
two  read  conditions  which  it  must  use.  This  is  because 
the  class  may  be  a  part  of  several  cycles. 

When  a  transaction  is  entered  at  a  TM,  the  TM  first 
determines  its  read-set  and  write-set.  It  then  determines 
to  which  class  that  transaction  belongs  (if  the 
transaction  can  run  in  more  than  one  class,  the  class  with 
the  fewest  synchronization  requirements  is  chosen). 
Having identified the transaction's class, only a table
lookup is required to determine what read conditions the
transaction must use.
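The run-time step described above amounts to a
classification followed by a dictionary lookup. The sketch
below reuses the three classes of the example in Section
2.13. The table contents, the function names, the rule that
a transaction fits a class when its read-set and write-set
are contained in the class's, and the use of "number of
read conditions" as the measure of synchronization cost are
all assumptions made for illustration.

```python
# Hypothetical class definitions: class -> (read-set, write-set).
CLASSES = {
    "I": ({"x"}, {"x"}),
    "J": ({"x"}, {"y"}),
    "K": ({"x", "y"}, set()),
}

# Hypothetical result of the design-time graph analysis:
# class -> list of (protocol, classes to synchronize against, DM).
READ_CONDITIONS = {
    "I": [],
    "J": [("P3", ("I",), "DM_alpha")],
    "K": [("P2", ("I", "J"), "DM_beta")],
}


def classify(read_set, write_set):
    """Pick the eligible class with the fewest read conditions."""
    eligible = [
        name
        for name, (class_reads, class_writes) in CLASSES.items()
        if read_set <= class_reads and write_set <= class_writes
    ]
    if not eligible:
        raise LookupError("transaction fits no predefined class")
    return min(eligible, key=lambda name: len(READ_CONDITIONS[name]))


def read_conditions_for(read_set, write_set):
    """Table lookup performed by the TM once the class is known."""
    return READ_CONDITIONS[classify(read_set, write_set)]
```

For example, a transaction that reads x and writes y is
placed in class J and inherits J's single P3 read
condition without any graph computation at run time.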
2.17 Safe Cycles

It happens that there are graph cycles which never cause a
non-serializable interleaving of transactions. In
particular, any cycle which does not contain a vertical
edge is always safe. Thus, a cycle composed entirely of
diagonal edges or entirely of horizontal edges will never
lead to a serializability problem and classes lying on
such cycles can safely run P1 (at least insofar as the
safe cycles are concerned). The cycle shown in Figure
2.13 is an example of a safe cycle.

This result is not immediately apparent through intuitive
understanding and is illustrative of the fact that a more
formal and precise treatment of serializability criteria
is needed.


(Some  intuitive  understanding  can  be  gained,  however, 
through  the  following  arguments.  First,  if  the  cycle 
consists  entirely  of  horizontal  edges  then  a 
serializability  problem  cannot  arise  because  horizontal 
edges  always  imply  a  timestamp  ordering  of  the 
transactions.  Second,  if  the  cycle  consists  entirely  of 
diagonal  edges  then  all  the  nodes  on  the  cycle  have  the 
same  DM  subscript.  Also  such  a  cycle  consists  of  a  series 
of  W-R-W  subpaths.  Remember  from  the  discussion  of 

protocol  P2  that  on  such  a  subpath  the  reading  transaction 
may  observe  a  particular  ordering  of  the  writing 
transactions  and  that  the  observed  ordering  depends  on  the 
actual  order  in  which  the  WRITES  were  processed  by  the  DM. 
Since  all  of  the  WRITES  on  the  cycle  are  being  processed 
by  the  same  DM,  it  must  be  the  case  that  the  reading 
transactions  all  observe  the  same  relative  ordering  among 
the  writing  transactions  and  hence  all  transactions  on  the 
cycle  will  be  serializable.) 
2.18 Summary and Conclusions

In  reviewing  the  concepts  presented  in  section  2,  it  is 
helpful  to  distinguish  between  three  kinds  of  properties 
of  an  SDD-1  system: 

1.  properties  that  are  intrinsic  to  the  way  the  SDD-1 
software  operates; 

2. properties that arise from database design
decisions; and

3.  properties  that  arise  from  the  analysis  of  the 
database  design. 

In  category  (1)  are  the  way  data  modules  process  READ 
messages  and  WRITE  messages,  the  way  clocks  operate,  the 
pipelining  rules,  and  the  way  each  protocol  works.  In 
category  (2)  are  the  choice  of  the  location  of  SDD-1  sites 
on  the  network,  the  choice  of  logical  fragments,  the 
location  of  physical  fragments,  the  configuration  of 
materializations,  the  choice  of  read-sets  and  write-sets 
for  each  class,  and  the  assignment  of  materializations  to 
each  class.  Finally,  in  category  (3)  is  the  assignment  of 
protocols  to  each  class. 
3. Selection and Analysis of Protocols


3.1  Logs 


To  develop  the  criteria  for  selecting  a  protocol  for  each 
class,  we  need  a  formal  model  for  transaction  processing. 
The model we have chosen, called logs, consists of a
string  of  symbols  that  represents  the  execution  of 
transactions,  READ  messages,  and  WRITE  messages.  Our 
claim  will  be  that  logs  embody  all  of  the  information 
about  system  execution  that  is  needed  to  reproduce  its 
input-output  behavior.  Verifying  this  claim  will  permit 
us  to  use  logs  as  a  formal  model  for  investigating  other 
aspects  of  the  behavior  of  SDD-1. 


There are three kinds of events that are of interest for
building logs: READ messages, WRITE messages, and local
transaction execution. We represent the processing of a
READ message for a transaction, a, at a data module,
alpha, by R^a_alpha. We represent the processing of a WRITE
message for a transaction, a, at a data module, alpha, by
W^a_alpha. Finally, we represent the local execution of a
transaction, a (in its transaction module), by E^a. In the
sequel, we will use lower case Roman letters near the
beginning of the alphabet to represent transactions, and
lower case Greek letters near the beginning of the
alphabet to represent data modules.

The  behavior  of  each  data  module  is  modelled  as  a  string 
of  R's  and  W's,  which  represents  the  order  in  which  READ 
messages  and  WRITE  messages  were  processed  (as  opposed  to 
received) by the data module. We call such a string a
local data module log. Each local data module log must
obey certain syntactic constraints that represent physical
properties that data modules must satisfy. In a local
data module log, say for data module alpha, the following
must hold:

D1. All R's and W's must have the same subscript,
alpha, since they are all processed at data module
alpha.

D2. For each transaction, a, at most one R^a_alpha
and one W^a_alpha can appear, since each transaction
can send at most one READ and one WRITE message to
any given data module.
The behavior of classes is modelled as a string of E's
called a global transaction log, which represents the
order in which transactions were executed as reflected by
their timestamps. The only syntactic constraint on a
global transaction log is

E1. For each transaction, a, only one E^a appears,
since a transaction receives only one timestamp.

A global transaction log induces certain additional
syntactic restrictions on a local data module log, which
indicate the proper orderings based on the pipelining
rules. In a local data module log, say for data module
alpha, the following must hold: if transaction a and
transaction a' run in the same class and E^a precedes E^a'
in the global transaction log, then

D3. (R-R pipelining) If R^a_alpha and R^a'_alpha appear,
then R^a_alpha precedes R^a'_alpha;

D4. (W-W pipelining) If W^a_alpha and W^a'_alpha appear,
then W^a_alpha precedes W^a'_alpha;

D5. (W-R pipelining) If W^a_alpha and R^a'_alpha appear,
then W^a_alpha precedes R^a'_alpha.
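Constraints D1-D5 are mechanical enough to check by
program. The following sketch assumes a representation that
is not in the original: a local log is a list of
(kind, transaction, data module) tuples with kind "R" or
"W", the global transaction log is a list of transaction
names in E order, and class_of maps each transaction to its
class. The function name is likewise hypothetical.

```python
def check_local_log(log, dm, e_order, class_of):
    """Return the sorted names of the constraints D1-D5 that `log` violates."""
    violations = set()

    # D1: every symbol carries the subscript of this data module.
    if any(symbol_dm != dm for _, _, symbol_dm in log):
        violations.add("D1")

    # D2: at most one R and one W per transaction.
    if len(set(log)) != len(log):
        violations.add("D2")

    # D3-D5: for transactions of the same class, the symbol of the
    # earlier transaction (in E order) must come first in the log.
    rank = {txn: n for n, txn in enumerate(e_order)}
    rules = {("R", "R"): "D3", ("W", "W"): "D4", ("W", "R"): "D5"}
    for pos, (kind_first, txn_first, _) in enumerate(log):
        for kind_second, txn_second, _ in log[pos + 1:]:
            # txn_second appears later in the log; it is a violation if
            # that transaction is the earlier one in E order.
            same_class = class_of[txn_first] == class_of[txn_second]
            out_of_order = rank[txn_second] < rank[txn_first]
            rule = rules.get((kind_second, kind_first))
            if same_class and out_of_order and rule:
                violations.add(rule)
    return sorted(violations)


e_order = ["a", "a2", "b"]
class_of = {"a": "I", "a2": "I", "b": "J"}
good_log = [("R", "a", "alpha"), ("W", "a", "alpha"), ("R", "a2", "alpha"),
            ("R", "b", "alpha"), ("W", "a2", "alpha")]
bad_log = [("R", "a2", "alpha"), ("W", "a", "alpha")]
```

Here bad_log breaks W-R pipelining: a2 follows a in the
same class, yet its READ is processed before a's WRITE.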
A log models an execution history of transactions on the
database. 

To  obtain  a  complete  picture  of  the  effect  that  logs  have 
on  the  database,  we  require  the  following  additional 
information,  relating  to  database  design: 

for  each  transaction  -  the  read-set  of  the  transaction, 
the  write-set  of  the  transaction,  and  the  class  in 
which  the  transaction  ran; 

for  each  data  module  -  the  set  of  physical  fragments 
that  is  stored  there;  and 

for  each  class  -  the  materialization  it  uses  for 
reading . 

For the sake of economy of the model and to enhance
mathematical tractability, we will normally leave the
transactions uninterpreted (in the sense of the program
schema theory [MANNA]). That is, for each logical data
item in the write-set of each transaction, we associate a
unique uninterpreted function letter that maps all of the
read-set into that write-set data item.

Given the above database design information, we must add
two more syntactic constraints on local data module logs
that guarantee that all of the relevant READ and WRITE
messages are actually issued. If E^a appears in the global
transaction log, then

1. If some data item in the read-set of transaction a
is obtained by the materialization of the class
under which transaction a runs from data module
alpha, then R^a_alpha appears in alpha's local data
module log.

2. If some data in the write-set of transaction a is
stored at data module alpha, then W^a_alpha appears in
alpha's local data module log.

In addition to these syntactic constraints, there is the
obvious semantic constraint that the logs accurately
represent the order in which R's and W's (in the case of
local data module logs) or E's (in a global transaction
log) actually were processed.

Suppose we have a global transaction log and a collection
of local data module logs that represent the execution of
the system during some period. These logs can be merged
into a single global system log by satisfying the
following conditions:

G1. All symbols in the global transaction log
appear in the global system log and appear in the
same order (e.g., if E^a precedes E^b in the global
transaction log, then E^a and E^b appear in the
global system log and E^a precedes E^b).

G2. For each local data module log, all symbols in
the local log appear in the global system log and
appear in the same order.

G3. For each transaction, a, and for each data
module, alpha, if R^a_alpha appears in the global
system log then E^a also appears in the global
system log and R^a_alpha precedes E^a.

G4. For each transaction, a, and for each data
module, alpha, if W^a_alpha appears in the global
system log then E^a also appears in the global
system log and E^a precedes W^a_alpha.

Given  a  global  log  and  its  associated  database  design 
information,  we  would  like  to  show  that  this  model  is 
sufficiently  powerful  to  reproduce  the  essential  aspects 
of  SDD-1  operation. 
Claim C  The log model of SDD-1 operation is complete in
the sense that given an initial value for all data items
in the database, a log, and an interpretation of the
function symbols for transactions, then there is a
mechanical procedure that could analyze the log and
reproduce the exact value history of each stored data item
at each data module.

The essence of Claim C is that timestamping information
for transactions and the parameters of READ and WRITE
messages are not needed in order to duplicate the actual
operation of the system, given that the log and associated
transaction and data distribution information is provided.
To prove this claim formally, we would need a formal model
for the operation of SDD-1 (at the level, say, of a RAM or
Turing machine) and a formal model of logs. Then we would
need to show an isomorphism between the value histories of
all stored data items of each model. We will not perform
this tedious task. Rather, we will demonstrate an
interpreter that can simulate SDD-1's operation with only
the information available in logs and the associated
transaction and data distribution information. We argue
along intuitive lines only that the interpreter is indeed
simulating correctly.
The interpreter maintains a simulated physical copy of
each  stored  data  item  in  each  data  module.  Instead  of 
storing  a  timestamp  with  each  stored  data  item,  we 
associate  the  transaction  name  of  the  last  transaction 
that  successfully  wrote  into  that  data  item.  Given  the 
total  ordering  of  E's  in  the  global  system  log,  this 
"transaction  label"  will  be  sufficient  to  reproduce  all  of 
the  essential  timestamping  information  in  the  system. 

Given  the  database  design  information,  we  can  obtain  the 
read-set  and  write-set  associated  with  each  R  and  W  in  the 
log. We also assume that for each uninterpreted function
letter in a transaction there is an interpretation (i.e., a
program).


Now, to execute a global system log, the interpreter
begins by initializing all stored data items to their
initial state and their associated transaction labels to
NULL. It then selects log symbols, one at a time
proceeding from left to right; for each symbol it does the
following:

i. If the symbol is a read, say R^a_alpha, then read
that portion of the read-set of transaction a that is
stored at data module alpha according to the
materialization of the class in which transaction a
executes. Store these values in a temporary work
space associated with transaction a.

ii. If the symbol is an E, say E^a, execute the
interpretation of transaction a on the read-set
values stored in its workspace. The resulting
write-set values should be stored back into its
workspace.

iii. If the symbol is a write, say W^a_alpha, then for
each data item in the write-set of transaction a that
also is stored at data module alpha, take the value
of the data item and store it in the stored data item
at alpha with transaction label = a if and only if
one of the following holds:

1. the transaction label for the data item at
alpha is NULL; or

2. the transaction label for the data item at
alpha is some b where E^b precedes E^a in the
global system log.
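The interpreter of steps (i)-(iii) is short enough to write
out. The version below is a sketch under assumptions that
go beyond the text: log symbols are encoded as
(kind, transaction, data module) tuples, each transaction's
interpretation is a Python function from a dictionary of
read values to a dictionary of written values, and
materializations are simplified so that a READ at a data
module fetches every read-set item stored there. The
function and variable names are hypothetical. The sample
log is the anomalous scenario of Section 2.13.

```python
def run_log(log, initial, stored_at, read_set, write_set, programs):
    """Replay a global system log; return (stored values, per-txn reads)."""
    value = {dm: {x: initial[x] for x in items}
             for dm, items in stored_at.items()}
    label = {dm: {x: None for x in items}        # None plays the role of NULL
             for dm, items in stored_at.items()}
    e_pos = {txn: n for n, (kind, txn, _) in enumerate(log) if kind == "E"}
    reads, writes = {}, {}

    for kind, txn, dm in log:
        if kind == "R":                                        # step i
            for x in read_set[txn] & stored_at[dm]:
                reads.setdefault(txn, {})[x] = value[dm][x]
        elif kind == "E":                                      # step ii
            writes[txn] = programs[txn](reads.get(txn, {}))
        else:                                                  # step iii
            for x in write_set[txn] & stored_at[dm]:
                old = label[dm][x]
                if old is None or e_pos[old] < e_pos[txn]:
                    value[dm][x] = writes[txn][x]
                    label[dm][x] = txn
    return value, reads


stored_at = {"alpha": {"x", "y"}, "beta": {"x", "y"}}
read_set = {"i": {"x"}, "j": {"x"}, "k": {"x", "y"}}
write_set = {"i": {"x"}, "j": {"y"}, "k": set()}
programs = {
    "i": lambda r: {"x": r["x"] + 1},
    "j": lambda r: {"y": r["x"] ** 2},
    "k": lambda r: {},
}
anomalous_log = [
    ("R", "i", "alpha"), ("E", "i", None), ("W", "i", "alpha"),
    ("R", "j", "alpha"), ("E", "j", None), ("W", "j", "alpha"),
    ("W", "j", "beta"),
    ("R", "k", "beta"), ("E", "k", None),
    ("W", "i", "beta"),
]
final, reads = run_log(anomalous_log, {"x": 0, "y": 0}, stored_at,
                       read_set, write_set, programs)
```

No timestamps or read conditions appear anywhere in the
replay; the order of E symbols alone decides which WRITE's
take effect.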

First, notice that the parameters (i.e., conditions) of
READ messages are not needed, in that the global system
log already specifies exactly which WRITE messages are
processed ahead of each READ message. Second, the
conditions for performing WRITE messages are exactly those
induced by the timestamping rules. The use of ordered E's
in the log to embody timestamping information is a crucial
conceptual simplification that makes the proofs in later
sections possible. Were we forced to use actual
timestamps instead, the notation would be much more
difficult to understand and manipulate.


3.2  Correctness  Criteria 


To  determine  how  to  assign  protocols  to  classes  to  yield 
correct  system  operation,  we  must  first  develop  precise 
conditions  for  correct  system  operation.  We  define  two 
conditions  that  characterize  the  correctness  of 
distributed database systems such as SDD-1. One
condition, called convergence, states that all copies of
each logical data item must be "converging" toward the
same value. The other condition, called serial
reproducibility, essentially states that the values toward
which the database is converging are mutually consistent.
We proceed more formally with a discussion of each of
these criteria.
3.2.1 Convergence


A  log  is  convergent  if,  given  a  database  state  in  which 
all  stored  copies  of  each  data  item  are  equivalent,  then 
the  log  transforms  that  state  into  another  state  with  the 
same  property.  (In  the  sequel,  we  use  "log"  to  mean 
"global  system  log".)  A  system  is  convergent  if  all  of 
the  logs  it  can  generate  are  convergent.  One  way  to  look 
at  system  convergence  is  to  imagine  that  if  the  processing 
of  E's  were  to  stop  at  any  time  and  all  WRITE  messages  for 
completed  E's  were  processed,  then  the  resulting  log  would 
be  convergent. 

Theorem CONV  Let L be a log generated by SDD-1. If for
each E in L all of E's WRITE messages are in L, then L is
convergent.

Proof  Consider an arbitrary logical data item, x, and let
E^a be the last transaction execution which has x in its
write-set. Since all WRITE messages for transaction a are
eventually processed (by hypothesis), for each data
module, alpha, that has a stored copy of x, W^a_alpha will
be the last WRITE message in L that successfully updates x.
Hence, all copies of x will be equivalent. Q.E.D.
Corollary  SDD-1 is convergent.


3.2.2 Serial Reproducibility


We define two logs, L1 and L2, to be equivalent if for all
initial database states and for all interpretations of the
transactions, L1 and L2 leave the database in the same
final state. In a log L, we say that a READ message
R^a_alpha reads from a message W^b_alpha if

i. There is a stored data item x at alpha that is
in the read-set of a and the write-set of b; and

ii. W^b_alpha precedes R^a_alpha in L; and

iii. W^b_alpha successfully updates x when it is
processed (i.e., E^b appears later in L than x's
current transaction label when W^b_alpha is
processed); and

iv. There is no c such that W^c_alpha follows
W^b_alpha and precedes R^a_alpha in L, and W^c_alpha
successfully writes into x (i.e., W^b_alpha is the
last write operation into x before R^a_alpha).
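Conditions (i)-(iv) can be computed by tracking the
transaction label of each stored item while scanning the
log. The sketch below is illustrative: the tuple encoding
of log symbols, the function name, and the convention that
None stands for "reads the initial value" are assumptions.
The two sample logs are the anomalous and corrected
scenarios of Section 2.13.

```python
def reads_from(log, stored_at, read_set, write_set):
    """Map (reader, data module, item) to the writer it reads from.

    A value of None means the READ sees the initial value of the item.
    """
    e_pos = {txn: n for n, (kind, txn, _) in enumerate(log) if kind == "E"}
    label = {dm: {x: None for x in items} for dm, items in stored_at.items()}
    result = {}
    for kind, txn, dm in log:
        if kind == "W":
            for x in write_set[txn] & stored_at[dm]:
                old = label[dm][x]
                # Condition iii: the WRITE takes effect only if its E
                # is later than the E of the current label.
                if old is None or e_pos[old] < e_pos[txn]:
                    label[dm][x] = txn
        elif kind == "R":
            for x in read_set[txn] & stored_at[dm]:
                # Conditions ii and iv: the label is the last
                # successful writer that precedes this READ.
                result[(txn, dm, x)] = label[dm][x]
    return result


stored_at = {"alpha": {"x", "y"}, "beta": {"x", "y"}}
read_set = {"i": {"x"}, "j": {"x"}, "k": {"x", "y"}}
write_set = {"i": {"x"}, "j": {"y"}, "k": set()}
prefix = [("R", "i", "alpha"), ("E", "i", None), ("W", "i", "alpha"),
          ("R", "j", "alpha"), ("E", "j", None), ("W", "j", "alpha"),
          ("W", "j", "beta")]
read_k = [("R", "k", "beta"), ("E", "k", None)]
late_i = [("W", "i", "beta")]

anomalous = reads_from(prefix + read_k + late_i,
                       stored_at, read_set, write_set)
corrected = reads_from(prefix + late_i + read_k,
                       stored_at, read_set, write_set)
```

The two logs differ only in what k reads x from, which is
exactly the difference between the inconsistent and the
consistent output.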


The notion of "reading from" characterizes log equivalence
in  the  following  sense. 
Theorem E  Let L1 and L2 be logs that contain the same set
of transactions. If every R reads each of its data items
from the same W in both L1 and L2, then L1 is equivalent
to L2.

Proof  The  proof  uses  Herbrand  interpretations  to  show  that 
each  data  item  displays  the  same  final  value  in  both  logs. 
This  is  a  standard  program  schema  theoretic  result  and  can 
be  found  (for  example)  in  [MANNA]. 

Theorem  E  can  be  extended  to  be  both  a  necessary  and 
sufficient  condition  for  equivalence  by  incorporating  the 
notion  of  "deadness"  as  in  [Papadimitriou  et  al]. 
However,  for  later  results,  we  only  need  the  sufficient 
condition  for  equivalence. 


We define a log to be serial if for each transaction a in
the log, all R^a symbols immediately precede E^a and all
W^a symbols immediately follow E^a. That is, a serial log
is of the form:

R^a_alpha ... R^a_omega E^a W^a_alpha ... W^a_omega
R^b_alpha ... R^b_omega E^b W^b_alpha ... W^b_omega
R^c_alpha ... R^c_omega E^c W^c_alpha ... W^c_omega ...
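The serial form can be tested directly: the log must split
into contiguous runs, one per transaction, each consisting
of zero or more R's, exactly one E, and zero or more W's.
The sketch below assumes the (kind, transaction, data
module) tuple encoding used for illustration; the function
name is hypothetical.

```python
from itertools import groupby


def is_serial(log):
    """True if every transaction's symbols form one run R...R E W...W."""
    seen = set()
    for txn, run in groupby(log, key=lambda symbol: symbol[1]):
        kinds = "".join(kind for kind, _, _ in run)
        after_reads = kinds.lstrip("R")
        # After the leading R's there must be one E, then only W's.
        well_formed = (after_reads.startswith("E")
                       and not after_reads[1:].strip("W"))
        if txn in seen or not well_formed:
            return False
        seen.add(txn)
    return True


serial_log = [("R", "a", "alpha"), ("R", "a", "beta"), ("E", "a", None),
              ("W", "a", "alpha"),
              ("R", "b", "alpha"), ("E", "b", None), ("W", "b", "beta")]
interleaved_log = [("R", "a", "alpha"), ("R", "b", "alpha"),
                   ("E", "a", None), ("E", "b", None)]
```

A transaction that reappears after another transaction's
run is rejected by the seen set, so interleavings fail the
test even when each fragment looks well formed.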


A log is serially reproducible if it is equivalent to a
serial log. A system is safe if all of the logs it can
generate are serially reproducible. Serial
reproducibility has been used as a correctness criterion
by many researchers [ESWARAN et al], [GRAY et al], and
[HEWITT] and arises from the following model. Our goal is
to show that the database is maintained in a "consistent"
state, where "consistency" is characterized, say, by a
predicate which is true for all consistent states. We
assume that every transaction preserves the consistency of
the database: given a copy of its read-set that is
consistent, it will produce a copy of its write-set
that is also consistent. Clearly, every serial log
preserves database consistency if each of its transactions
preserves database consistency; in this case, all data
items are updated cosynchronously, because all WRITE
messages of a transaction are processed before the next
READ message is processed. Since a serially reproducible
log is equivalent to a serial log, serially reproducible
logs preserve consistency as well.

SDD-1 guarantees serial reproducibility by the rules that
govern the selection of protocols for classes. That is,
if every class executes all of its transactions according
to the prespecified protocols, then the log of all
transactions executed by all classes is serially
reproducible. In the remainder of Section 3 we will
develop these protocol selection rules. In Section 4 we
will prove that they do in fact make SDD-1 logs serially
reproducible.
3.3 Log Transformations


To determine if a log is serially reproducible, we will
define an effective procedure to transform a log into an
equivalent serial one. The procedure is based on
equivalence preserving transformations on logs. These
transformations are in the form of "switching rules",
i.e., equivalence preserving rules for switching adjacent
log symbols. Each of the following switching rules is of
the form "... x1 x2 ... = ... x2 x1 ... under condition
C", which means that if symbols x1 and x2 are adjacent in
a log and they satisfy condition C, then they can be
switched and the resulting log is equivalent to the log
before the switch.


TR1. ... R^a_alpha R^b_alpha ... ≡ ... R^b_alpha R^a_alpha ...
where a and b run in different classes.

TR2. ... R^a_alpha R^b_beta ... ≡ ... R^b_beta R^a_alpha ...
where alpha ≠ beta.

TR3. ... E^a E^b ... ≡ ... E^b E^a ... where a and b
run in different classes and have nonintersecting
write-sets.

TR4. ... W^a_alpha W^b_alpha ... ≡ ... W^b_alpha W^a_alpha ...
if a and b run in different classes.

TR5. ... W^a_alpha W^b_beta ... ≡ ... W^b_beta W^a_alpha ...
if alpha ≠ beta.

TR6. ... R^a_alpha W^b_beta ... ≡ ... W^b_beta R^a_alpha ...
if alpha ≠ beta.

TR7. ... R^a_alpha W^b_alpha ... ≡ ... W^b_alpha R^a_alpha ...
if a and b run in different classes and there
is no stored data item at alpha that is common to
transaction a's read-set and transaction b's
write-set.

Theorem  TR  The  transformations  TR1  -  TR7  are  sound,  i.e.,
they  preserve  log  equivalence.

Proof  Follows  directly  from  theorem  E  and  the  definitions
of  the  transformations.  Q.E.D.
We note in passing that the transformations TR1 - TR7 are
in no sense complete with respect to equivalence. That
is, given two equivalent logs, L1 and L2, there may be no
sequence of applications of TR1 - TR7 to L1 that yields L2.
There are several reasons for this. First, all of the
transformations preserve the pipelining rules in addition
to equivalence, which thereby weakens them. Second, the
transformations preserve certain timing information, which
in some cases is not needed to preserve equivalence.
Finally, pairwise switching is not sufficient to handle
all equivalence situations; logs can be constructed which
have entire sublogs that can be switched in an equivalence
preserving way, such that no sequence of pairwise switches
can reproduce the sublog switch. These observations are
parenthetical to the results that follow, since the
soundness of TR1 - TR7 is all that is required.
3.4  Conflict  Graphs


From  TR1  -  TR7  we  can  derive  the  set  of  invalid  switches,
i.e.,  those  switches  that  are  not  permitted  by  TR1  -  TR7.
These  invalid  switches  correspond  to  potential  conflicts
between  transactions  and,  as  we  will  see,  can  lead  to
non-serially  reproducible  logs.  The  invalid  switches,
called  conflicts,  are:

NTR1. ... R^a_alpha R^b_alpha ... where a and b run in
the same class.

NTR2. ... W^a_alpha W^b_alpha ... where a and b run in
the same class.

NTR3. ... R^a_alpha W^b_alpha ... or ... W^b_alpha R^a_alpha ...
where either a and b run in the same class or
there is a stored data item at alpha that is common
to transaction a's read-set and transaction b's
write-set.

NTR4. ... R^a_alpha E^a ...

NTR5. ... E^a W^a_alpha ...

NTR6. ... E^a E^b ... where a and b run in the same
class or have intersecting write-sets.


It  is  easily  checked  that  these  are  the  only  pairs  that 
cannot  be  switched  using  TR1  -  TR7.
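
The conflict rules NTR1 - NTR6 can be read as a predicate on
an ordered pair of adjacent log symbols. The following Python
sketch is illustrative only: the tuple encoding of symbols and
the argument names (txn_class, stored_reads, stored_writes,
write_set) are assumptions made for this example and are not
part of SDD-1.

```python
from typing import NamedTuple, Optional


class Sym(NamedTuple):
    kind: str               # "R", "E", or "W"
    txn: str                # transaction name
    module: Optional[str]   # data module for R and W; None for E


def conflicts(x, y, txn_class, stored_reads, stored_writes, write_set):
    """Return True if adjacent symbols x y (in that order) may not be switched.

    txn_class[t]        -> class in which transaction t runs
    stored_reads[t][m]  -> stored data items that t reads at module m
    stored_writes[t][m] -> stored data items that t writes at module m
    write_set[t]        -> write-set of t
    """
    same_class = txn_class[x.txn] == txn_class[y.txn]
    kinds = (x.kind, y.kind)
    if kinds == ("R", "R") or kinds == ("W", "W"):        # NTR1, NTR2
        return x.module == y.module and same_class
    if set(kinds) == {"R", "W"}:                          # NTR3
        if x.module != y.module:
            return False
        r, w = (x, y) if x.kind == "R" else (y, x)
        common = (stored_reads[r.txn].get(r.module, set())
                  & stored_writes[w.txn].get(w.module, set()))
        return same_class or bool(common)
    if kinds == ("R", "E") or kinds == ("E", "W"):        # NTR4, NTR5
        return x.txn == y.txn
    if kinds == ("E", "E"):                               # NTR6
        return same_class or bool(write_set[x.txn] & write_set[y.txn])
    return False
```

Any pair for which this predicate returns False is taken to be
switchable in the sketch, mirroring the claim that NTR1 - NTR6
are the only unswitchable pairs.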


The  above  conflicts  can  be  modelled  by  a  node-labelled 
undirected  graph  whose  nodes  represent  generic  log  symbols 
and  whose  edges  represent  potential  conflicts  between  log 
symbols.  The  graph  is  defined  over  a  finite  set  of 
classes,  denoted  {ā, b̄, c̄, ..., z̄},  and  associated  with  each
class  is  a  read-set,  a  write-set,  and  a  materialization. 


We define a conflict graph CG = <V,E> as follows ('+'
denotes set union):

V = {e^ā: all classes ā}
  + {r^ā_alpha: all classes ā and all data modules alpha}
  + {w^ā_alpha: all classes ā and all data modules alpha}

E = E_vert + E_horiz + E_diag

E_vert = {(r^ā_alpha, e^ā): all classes ā and all data
modules alpha} + {(e^ā, w^ā_alpha): all classes ā and
all data modules alpha}

E_horiz = {(e^ā, e^b̄): all classes ā, b̄ where the
write-sets of ā and b̄ have a nonempty intersection}

E_diag = {(r^ā_alpha, w^b̄_alpha): all classes ā, b̄ and all
data modules alpha such that the portion of b̄'s
write-set stored at alpha has a nonempty
intersection with the portion of ā's read-set which
is stored at alpha under ā's materialization}


The notions of vertical, horizontal and diagonal edges
derive from the following convention for drawing conflict
graphs. For each class ā, we draw all of ā's r nodes in a
row, beneath which we draw ā's e node, beneath which we
draw ā's w nodes in a row. (See figure 2.2.) The E_vert
edges connect each e to all of its r's and w's; these
edges are (in a manner of speaking) vertical. Groups of
nodes for different classes are arranged in a row (see
figure 2.3). The E_horiz edges connecting e's in different
classes are therefore horizontal, and the E_diag edges
connecting an r and a w from different classes are diagonal.
We have found these conventions to be very convenient when
discussing conflict graphs.
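
As an illustration of the definition above, the three edge
sets can be computed directly from a description of the
classes. This is a minimal sketch under assumed conventions:
the dictionary layout ("reads" and "writes" keyed by data
module) and the node tuples are hypothetical, and nodes are
created only for modules a class actually reads or writes.

```python
from itertools import combinations


def conflict_graph(classes):
    """Build the vertical, horizontal and diagonal edge sets.

    classes maps a class name to a dict with two (assumed) keys:
      "reads":  {module: data items read at that module under the
                 class's materialization}
      "writes": {module: data items written at that module}
    Nodes are ("r", cls, module), ("e", cls) and ("w", cls, module).
    """
    vert, horiz, diag = set(), set(), set()
    for c, spec in classes.items():
        for m in spec["reads"]:
            vert.add((("r", c, m), ("e", c)))
        for m in spec["writes"]:
            vert.add((("e", c), ("w", c, m)))
    for a, b in combinations(classes, 2):
        # Horizontal edge: the two write-sets intersect.
        wa = set().union(*classes[a]["writes"].values())
        wb = set().union(*classes[b]["writes"].values())
        if wa & wb:
            horiz.add((("e", a), ("e", b)))
    for a in classes:
        for b in classes:
            if a == b:
                continue
            # Diagonal edge: b writes at m something that a reads at m.
            for m, read_items in classes[a]["reads"].items():
                if read_items & classes[b]["writes"].get(m, set()):
                    diag.add((("r", a, m), ("w", b, m)))
    return vert, horiz, diag
```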
3.5  Protocol  Selection  Rules


A conflict graph cycle that contains a vertical edge can
lead to a nonserializable log, because the edges of the
cycle can correspond to conflicting (and hence
unswitchable) symbols in the log. The rules for selecting
which protocols to use for each READ message in each class
are built around cycles in the conflict graph. We
conclude Section 3 by enumerating these rules. In Section
4 we prove that if all transactions obey these rules, then
all logs are serially reproducible. The rules are:


PSR3. If r^ā_alpha lies on a cycle in the conflict graph and
the cycle contains the subpath (w^b̄_alpha, r^ā_alpha, e^ā, w^ā_beta)
or the subpath (w^b̄_alpha, r^ā_alpha, e^ā, e^c̄) for some classes b̄ and
c̄ and some data module beta, then for each transaction a
in ā, run R^a_alpha under protocol P3 with respect to b̄.


PSR2F. If r^ā_alpha and r^ā_beta lie on a cycle in the conflict
graph and the cycle contains the subpath (w^b̄_beta, r^ā_beta,
e^ā, r^ā_alpha, w^c̄_alpha) for some classes b̄ and c̄, then for
each transaction a in ā, run R^a_alpha and R^a_beta under
protocol schema P2F against b̄ at beta and against c̄ at
alpha.


PSR2. If r^ā_alpha lies on a cycle in the conflict graph and
the cycle both contains a vertical edge and contains the
subpath (w^b̄_alpha, r^ā_alpha, w^c̄_alpha) for some b̄ and c̄, then
for each transaction a in ā, run R^a_alpha under protocol P2
against b̄ and c̄ at alpha.


These  protocols  must  be  satisfied  for  all  cycles  in  the 
conflict  graph.  That  is,  if  an  r  lies  on  several  cycles 
and  thereby  satisfies  several  of  the  PSRs,  then  that  READ 
message  must  include  conditions  to  satisfy  all  of  its 
PSRs.  If  an  r  satisfies  none  of  the  above  PSRs,  either 
because  it  lies  on  no  cycles  or  because  none  of  the  cycles 
on  which  it  lies  have  the  undesirable  properties,  then 
that  r  can  run  protocol  P1.  It  is  expected  that  under  a
suitable  database  design  and  for  many  applications,  most
transactions  need  only  run  under  protocol  schema  P1.

Theorem  SR  If  all  of  the  transactions  in  a  log  use  the 
correct  protocol  as  outlined  by  the  protocol  selection 
rules,  then  the  log  is  serially  reproducible. 


Proof  See  Section  4. 


Corollary  SDD-1  is  safe.


Page  -92- 
Section  4 


SDD-1  Concurrency  Control  Mechanism 
Proof  of  Serial  Reproducibility 


4.  Proof  of  Serial  Reproducibility 


4.1  Introduction 

This  section  contains  a  proof  of  theorem  SR,  which 
demonstrates  that  the  SDD-1  protocol  selection  rules  lead 
to  serially  reproducible  logs.  Since  the  proof  is  rather 
long  and  its  details  may  not  be  of  interest  to  all 
readers,  we  will  first  present  a  brief  overview  of  the 
proof.  To  prove  the  theorem  formally,  we  need  to 
formalize  the  concepts  of  the  previous  sections.  This 
formalism  is  presented  in  Section  4.2.  The  proof  itself 
comes  in  two  parts  and  is  presented  in  Sections  4.3  and 
4.4. 

This  proof  only  includes  protocols  P1,  P2,  P2f,  and  P3.  A
proof  that  also  embodies  protocol  P4  has  been  produced  and 
will  appear  in  a  later  report. 

To prove that all logs are serially reproducible, we
assume the contrary and show a contradiction. That is, we
assume that there is some log, say LOG_given, which
resulted from the correct operation of SDD-1 and that
LOG_given is not serially reproducible. The general
approach we will take is to try to serialize LOG_given
using the transformations TR1 - TR7. When we get stuck,
as we must since LOG_given is not serially reproducible, we
examine the "stuck" log and derive from the log certain
properties of the conflict graph that demonstrate that
LOG_given must have violated the protocol selection rules
(PSRs). Thus, the proof proceeds in two stages: first,
the attempt to serialize LOG_given; second, the
construction of the PSR contradiction.

To serialize LOG_given, we begin at the left end of the log
and try to serialize each R so that it is adjacent to its
corresponding E and each W so that it is adjacent to its
corresponding E. Suppose, for example, that we are trying
to serialize R^a_alpha to be adjacent to E^a. By applying
switches permitted by TR1 - TR7 of adjacent symbols in the
sublog that separates R^a_alpha from E^a, we try to move R^a_alpha
closer to E^a. That is, we try to move each symbol in this
sublog either to the left of R^a_alpha or to the right of E^a.
If we can move all of the symbols in this sublog out of
the way, then we will end up with R^a_alpha adjacent to E^a.
We can apply essentially the same procedure to move each
W^a_alpha to be adjacent to E^a.
Since LOG_given is not serially reproducible, this
procedure that tries to serialize LOG_given will eventually
fail to be able to serialize some R or W. Suppose that
R^a_alpha cannot be serialized with E^a. Then we have a
sublog of the form R^a_alpha ... E^a in which every
intermediate symbol is in conflict with some symbol both
on its right and on its left, since otherwise the symbol
would have been removed by the above applications of TR1 -
TR7. Similarly, had we gotten stuck by a W^a_alpha, we would
have obtained a sublog E^a ... W^a_alpha with the same
property. Finding this blocked sublog completes the first
stage of the proof.
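
The first stage can be pictured with a small sketch. It is a
simplification made for illustration, not the serialization
procedure defined later in this section: the function name
blocked_sublog and the conflicts callback are hypothetical,
and symbols that can be switched out of the way are simply
dropped from the sublog rather than re-inserted elsewhere in
the log.

```python
def blocked_sublog(log, i, j, conflicts):
    """Try to bring log[i] and log[j] (i < j) together.

    conflicts(x, y) must return True when x immediately followed by y
    cannot be switched. An intermediate symbol can leave the sublog if
    it conflicts with nothing to its left (it can be switched past all
    of them, out through the left end) or with nothing to its right.
    Returns the sublog that remains; a result of length 2 means the
    two end symbols were made adjacent.
    """
    sub = list(log[i:j + 1])
    changed = True
    while changed:
        changed = False
        for k in range(1, len(sub) - 1):
            s = sub[k]
            free_left = not any(conflicts(x, s) for x in sub[:k])
            free_right = not any(conflicts(s, y) for y in sub[k + 1:])
            if free_left or free_right:
                del sub[k]
                changed = True
                break
    return sub
```

When the result is longer than two symbols, every remaining
intermediate symbol conflicts with something on its left and
something on its right, which is the blocked situation
described above.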


Suppose, again, that the blocked sublog is R^a_alpha ... E^a.
The second stage of the proof begins with the observation
that since every symbol in the sublog conflicts with some
symbol on its left and right in the sublog and since every
conflict corresponds to an edge in the conflict graph,
then there is a path from R^a_alpha to E^a in the conflict
graph. Furthermore, we know that the edge (R^a_alpha, E^a) is
in E_vert, hence completing a cycle. Since R^a_alpha lies on
a cycle, it is subject to the PSRs. By analyzing the
blocked sublog in more detail, enumerating the possible
symbols that could be R^a_alpha's conflicting right neighbor
and those that could be E^a's conflicting left neighbor, we
show that in each and every case either R^a_alpha violated a
protocol it was supposed to use according to the PSRs or
that LOG_given must have violated one of the pipelining
rules. The conclusion, then, must be that this blocked
sublog could not have arisen in the process of trying to
serialize LOG_given. The very same kind of argument can be
applied if the blocked sublog E^a ... W^a_alpha had resulted
from stage one. So, the attempt to serialize must
inevitably succeed and LOG_given is serially reproducible.

There  are  numerous  pitfalls  in  this  line  of  proof  that 
require  a  rigorous  approach  to  be  taken.  We  proceed,  now, 
with  this  rigorous  development. 


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4.2  A  Formal  Model  for  SDD-1 

4.2.1  A  Database  Design 

A  database  design  for  SDD-1  is  a  ten-tuple 

D  =  <DELTA,  KAPPA,  LAMBDA,  SIGMA,  MATZN,  logical,
matzn-of-class,  stored-data,  readset,  writeset>

where  the  components  of  D  are  defined  as  follows  (upper 
case  components  are  sets  and  lower  case  components  are 
functions) : 

1.  DELTA  =  {alpha,  beta,  gamma,  delta,...}  is  the  set 
of  all  data  modules. 

2.  KAPPA  =  {ā, b̄, c̄, ...}  is  the  set  of  all  classes.

3.  LAMBDA  is  the  set  of  all  logical  fragments. 
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4.  SIGMA  is  the  set  of  all  stored  fragments. 

5.  logical:  SIGMA  ->  LAMBDA.  Each  stored  fragment 
sigma  in  SIGMA  is  a  physical  incarnation  of  some 
logical  fragment  specified  by  logical(sigma) . 

6.  MATZN  =  {matzn_1,  matzn_2,  matzn_3,  ...}  is  the  set  of
all  materializations.  Each  materialization  is  a
total  function  and  matzn_i:  LAMBDA  ->  SIGMA  such
that  for  each  lambda  in  LAMBDA,
logical(matzn_i(lambda))  =  lambda.

7.  matzn-of-class:  KAPPA  ->  MATZN.  Each  class  a  in 
KAPPA  runs  in  some  materialization,  specified  by 
matzn-of-class(a) . 

8.  stored-data:  SIGMA  ->  DELTA.  Each  stored  fragment 
sigma  in  SIGMA  is  stored  at  a  data  module, 
specified  by  stored-data(sigma) . 

9.  readset:  KAPPA  ->  2^LAMBDA.  Each  class  ā  in  KAPPA
has  a  readset  that  is  a  subset  of  LAMBDA,  specified
by  readset(ā).

10.  writeset:  KAPPA  ->  2^LAMBDA.  Each  class  ā  in
KAPPA  has  a  writeset  that  is  a  subset  of  LAMBDA,
specified  by  writeset(ā).

When  designing  a  database,  one  has  to  specify  data 
distribution  and  class  structure  by  specifying  each  of  the 
above  ten  components. 
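
For readers who prefer code, the ten components can be
written down as a record with a consistency check. This is
only a sketch: the class name Design, the field names, and
the use of strings for fragments and modules are assumptions
for the example, and materializations are represented as
named dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Design:
    modules: FrozenSet[str]              # DELTA
    classes: FrozenSet[str]              # KAPPA
    logical_frags: FrozenSet[str]        # LAMBDA
    stored_frags: FrozenSet[str]         # SIGMA
    matzns: Dict[str, Dict[str, str]]    # MATZN: name -> (LAMBDA -> SIGMA)
    logical: Dict[str, str]              # SIGMA -> LAMBDA
    matzn_of_class: Dict[str, str]       # KAPPA -> name of a materialization
    stored_data: Dict[str, str]          # SIGMA -> DELTA
    readset: Dict[str, FrozenSet[str]]   # KAPPA -> subsets of LAMBDA
    writeset: Dict[str, FrozenSet[str]]  # KAPPA -> subsets of LAMBDA

    def check(self) -> bool:
        """Check the constraints stated for components 5 through 10."""
        placed_ok = all(
            self.logical.get(s) in self.logical_frags
            and self.stored_data.get(s) in self.modules
            for s in self.stored_frags
        )
        # Each materialization is total on LAMBDA and picks, for every
        # logical fragment, a stored copy of that same fragment.
        matzn_ok = all(
            set(m) == set(self.logical_frags)
            and all(self.logical.get(m[lam]) == lam for lam in m)
            for m in self.matzns.values()
        )
        class_ok = all(
            self.matzn_of_class.get(c) in self.matzns
            and self.readset.get(c, frozenset()) <= self.logical_frags
            and self.writeset.get(c, frozenset()) <= self.logical_frags
            for c in self.classes
        )
        return placed_ok and matzn_ok and class_ok
```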


4.2.2  Logs 


The execution of the system is completely characterized by
a log. Logs are built on transactions. We define a
transaction set over a database design D to be a
four-tuple TAU(D) = <TN, transclass, transreadset,
transwriteset> where the components of TAU are:

1. TN = {a, b, c, d, ...} is a set of transaction names.

2. transclass: TN -> KAPPA. Each transaction a in TN
runs in a single class specified by transclass(a).

3. transreadset: TN -> 2^LAMBDA. Each transaction a
in TN has a readset that is a subset of LAMBDA,
specified by transreadset(a), such that
transreadset(a) is contained in
readset(transclass(a)).

4. transwriteset: TN -> 2^LAMBDA. Each transaction a
in TN has a writeset that is a subset of LAMBDA,
specified by transwriteset(a), such that
transwriteset(a) is contained in
writeset(transclass(a)).


A log is a string defined over a database design D and a
transaction set TAU. The symbols of a log, L, are
selected from the set R + E + W ('+' is set union) where

R = {R^a_alpha : all a in TN, all alpha in DELTA}

E = {E^a : all a in TN}

W = {W^a_alpha : all a in TN, all alpha in DELTA}.


A well-formed log, L, satisfies the following
restrictions:

1. No element of R + E + W appears more than once in
L.

2. For each a in TN, if E^a appears in L then for all
alpha in DELTA:

i. if
matzn-of-class(transclass(a))(readset(transclass(a)))
has a non-empty intersection with
stored-data^-1(alpha), then R^a_alpha appears in L and
precedes E^a;* and

ii. if transwriteset(a) has a nonempty intersection
with stored-data^-1(alpha), then W^a_alpha appears in L
and E^a precedes W^a_alpha. (Note: by "precedes" we
mean "appears somewhere in the string to the left
of".)


A well-formed log, L, satisfies the pipelining rules if
for each a and a' where E^a precedes E^a' in L and
transclass(a) = transclass(a') then

1. (R-R rule) for each alpha in DELTA where R^a_alpha and
R^a'_alpha are both in L, R^a_alpha precedes R^a'_alpha;

2. (W-W rule) for each alpha in DELTA where W^a_alpha and
W^a'_alpha are both in L, W^a_alpha precedes W^a'_alpha;

3. (W-R rule) for each alpha in DELTA where W^a_alpha and
R^a'_alpha are both in L, W^a_alpha precedes R^a'_alpha.


*  This  definition  implies  a  READ  message  is  sent  to  alpha 
if  the  materialization  obtains  part  of  the  class  read-set 
from  alpha,  even  if  the  particular  transaction  does  not 
read  any  data  from  alpha.  In  the  implementation  of  SDD-1, 
read  conditions  make  it  possible  to  avoid  sending  the  READ 
messages  in  the  latter  case,  by  adding  extra  read
conditions  to  the  next  READ  message  that  goes  to  alpha 
from  transclass( a) . 
A  system  is  well-formed  and  satisfies  the  pipelining  rules
if  all  of  the  logs  it  can  generate  have  these  properties.
Unless  explicitly  stated  otherwise,  in  the  sequel  we 
assume  that  all  logs  are  well-formed  and  satisfy  the 
pipelining  rules. 
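
The three pipelining rules lend themselves to a direct check
on a log. The sketch below assumes the log is well-formed (so
no symbol repeats) and encodes symbols as the tuples
("R", a, alpha), ("E", a) and ("W", a, alpha); this encoding
and the function name are assumptions for illustration.

```python
def satisfies_pipelining(log, txn_class):
    """Check the R-R, W-W and W-R rules for every same-class pair.

    log is a list of symbol tuples; txn_class maps each transaction
    name to the class in which it runs.
    """
    pos = {sym: i for i, sym in enumerate(log)}
    ends = [s[1] for s in log if s[0] == "E"]       # transactions in E order
    modules = {s[2] for s in log if s[0] != "E"}
    for i, a in enumerate(ends):
        for a2 in ends[i + 1:]:                     # E^a precedes E^a2
            if txn_class[a] != txn_class[a2]:
                continue
            for m in modules:
                for k1, k2 in (("R", "R"), ("W", "W"), ("W", "R")):
                    first, second = (k1, a, m), (k2, a2, m)
                    if (first in pos and second in pos
                            and pos[first] > pos[second]):
                        return False
    return True
```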

4.2.3  Conflict  Graphs 


We  redefine  conflict  graphs,  the  protocols,  and  the 
protocol  selection  rules  in  terms  of  the  above  formalism. 


A  conflict  graph 

CG(D)  =  <VERTICES,  EDGES>

is  a  vertex-labelled  undirected  graph  defined  over  a 
database  design  D  as  follows: 
EDGES_horiz = {(e^ā, e^b̄) : ā ≠ b̄ and the intersection of
writeset(ā) and writeset(b̄) is nonempty}


EDGES_diag = {(r^ā_alpha, w^b̄_alpha) : ā ≠ b̄
and the three-way intersection of
matzn-of-class(ā)(readset(ā)) and
logical^-1(writeset(b̄)) and stored-data^-1(alpha)
is nonempty}


In a conflict graph, CG(D), a path is a sequence (a_1, a_2,
..., a_n) where for each i, 1 <= i < n, (a_i, a_{i+1}) is an
edge of CG(D). If a_1 = a_n and no edge appears twice in
the path, then the path is called a cycle.


An edge (a_i, a_j) in CG(D) is called heterogeneous if the
two nodes have different superscripts (i.e., are in
different classes). A path (or cycle) is nonredundant if
each class is a superscript in at most two heterogeneous
edges in the path (or cycle).
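
The definitions of cycle and nonredundant path translate into
two short checks. In this sketch nodes are assumed to be
tuples whose second element is the class, such as
("r", "A", "m1") or ("e", "A"), and the edge set is assumed
to be given as a set of two-element frozensets; both
conventions are hypothetical.

```python
def is_cycle(path, edges):
    """A path is a cycle if it returns to its start, every step is an
    edge of the graph, and no edge is used twice."""
    steps = [frozenset(step) for step in zip(path, path[1:])]
    return (len(path) > 1
            and path[0] == path[-1]
            and all(step in edges for step in steps)
            and len(set(steps)) == len(steps))


def is_nonredundant(path):
    """Each class may be a superscript in at most two heterogeneous
    edges (edges whose endpoints lie in different classes)."""
    count = {}
    for u, v in zip(path, path[1:]):
        if u[1] != v[1]:
            for cls in (u[1], v[1]):
                count[cls] = count.get(cls, 0) + 1
    return all(n <= 2 for n in count.values())
```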
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4.2.4  Protocols  and  Protocol  Selection  Rules 


The  protocols  are  now  defined  purely  in  terms  of  logs. 
The  timestamping  mechanisms  described  in  Section  2  can  be 
thought  of  as  a  method  of  implementing  the  protocols. 


A read operation R^a_alpha satisfies protocol P2 in log L
with respect to classes {ā_1,...,ā_n} in KAPPA if there
exists a transaction b such that R^a_alpha satisfies the
"partitioned writes property" with respect to E^b and
{ā_1,...,ā_n}. A read operation R^a_alpha satisfies the
partitioned writes property with respect to E^b and
{ā_1,...,ā_n} in log L if for each transaction c with E^c in
L and transclass(c) in {ā_1,...,ā_n}:

1. If E^c precedes E^b and W^c_alpha appears in L, then
W^c_alpha precedes R^a_alpha in L; and

2. If (b = c or E^b precedes E^c) and W^c_alpha appears in
L, then R^a_alpha precedes W^c_alpha.


Two read operations R^a_alpha and R^a_beta satisfy protocol P2f
with respect to classes {ā_1,...,ā_m} and {ā_{m+1},...,ā_n}
(respectively) in log L if there exists a transaction, b,
such that R^a_alpha satisfies the partitioned writes property
with respect to E^b and {ā_1,...,ā_m} and R^a_beta satisfies the
partitioned writes property with respect to E^b and
{ā_{m+1},...,ā_n}.

A read operation R^a_alpha satisfies protocol P3 with respect
to {ā_1,...,ā_n} in log L if it satisfies the partitioned
writes property with respect to E^a and {ā_1,...,ā_n}.
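
The partitioned writes property, and protocols P2 and P3
built on it, can be checked mechanically on a log. The
following is a sketch under assumptions: symbols are encoded
as ("R", a, alpha), ("E", a) and ("W", a, alpha), the
function names are hypothetical, and E^b (or E^a for P3) is
assumed to appear in the log.

```python
def partitioned_writes(log, read_sym, b, classes, txn_class):
    """Does read_sym = ("R", a, alpha) satisfy the partitioned writes
    property with respect to E^b and the given set of classes?"""
    pos = {s: i for i, s in enumerate(log)}
    alpha = read_sym[2]
    r_pos, eb_pos = pos[read_sym], pos[("E", b)]
    for sym in log:
        if sym[0] != "E" or txn_class[sym[1]] not in classes:
            continue
        w = ("W", sym[1], alpha)
        if w not in pos:
            continue
        if pos[sym] < eb_pos:
            # E^c precedes E^b: the write must precede the read.
            if pos[w] > r_pos:
                return False
        elif r_pos > pos[w]:
            # c = b or E^b precedes E^c: the read must precede the write.
            return False
    return True


def satisfies_p3(log, read_sym, classes, txn_class):
    """P3: the partition point is the reader's own E symbol."""
    return partitioned_writes(log, read_sym, read_sym[1], classes, txn_class)


def satisfies_p2(log, read_sym, classes, txn_class):
    """P2: some transaction b in the log supplies the partition point."""
    return any(partitioned_writes(log, read_sym, s[1], classes, txn_class)
               for s in log if s[0] == "E")
```

Since P3 fixes the partition point at E^a while P2 accepts any
E^b, a read that satisfies P3 with respect to a set of classes
also satisfies P2 with respect to the same set in this sketch.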

Two remarks should be made regarding these protocols.
First, the protocols are mutually compatible in the
following sense: if R^a_alpha satisfies protocol P3 with
respect to {ā_1,...,ā_m} and R^a_beta satisfies protocol P3
with respect to {ā_{m+1},...,ā_n}, then R^a_alpha and R^a_beta
satisfy protocol P2f with respect to {ā_1,...,ā_m} and
{ā_{m+1},...,ā_n} (respectively) and R^a_alpha (for example)
satisfies protocol P2 with respect to {ā_1,...,ā_m}. Second,
protocol P2 allows a single R^a_alpha to satisfy P2 with
respect to two different sets of classes using two
different transaction b's. That is, R^a_alpha can satisfy P2
with respect to {ā_1,...,ā_n} because it satisfies the
partitioned writes property with respect to E^b and
{ā_1,...,ā_n} while in the same log it satisfies P2 with
respect to {c̄_1,...,c̄_m} because it satisfies the
partitioned writes property with respect to E^b' and
{c̄_1,...,c̄_m}. Yet, there may be no single E^b such that
R^a_alpha satisfies the partitioned writes property with
respect to both sets. This subtlety cannot be handled by
the read conditions described in Section 2 without some
modification.

We complete our formal model by defining the protocol
selection rules (abbr. PSRs). Let CG(D) be a conflict
graph over the design D and let L be a log defined over D
and transaction set TAU. Then L satisfies the protocol
selection rules if each of the following hold:

PSR1. For all alpha in DELTA and a in TAU, if
r^ā_alpha (where ā = transclass(a)) lies on a
nonredundant cycle in CG(D) in a subpath of the
form (w^b̄_alpha, r^ā_alpha, e^ā, w^ā_beta) or (w^b̄_alpha,
r^ā_alpha, e^ā, e^c̄) for some b̄, c̄ in KAPPA and beta in
DELTA, then R^a_alpha is in L and R^a_alpha satisfies
protocol P3 with respect to class b̄ at alpha in L.


PSR2. For all alpha, beta in DELTA and a in TAU,
if r^ā_alpha and r^ā_beta (ā = transclass(a)) lie on a
nonredundant cycle in CG(D) in a subpath of the
form (w^b̄_beta, r^ā_beta, e^ā, r^ā_alpha, w^c̄_alpha), for some b̄
and c̄ in KAPPA (b̄ ≠ c̄), then R^a_alpha and R^a_beta
appear in L and R^a_alpha and R^a_beta satisfy protocol
P2f with respect to {c̄} and {b̄} respectively in L.

PSR3. For all alpha in DELTA and a in TAU, if
r^ā_alpha (ā = transclass(a)) lies on a nonredundant
cycle that contains a vertical edge in CG(D) in a
subpath of the form (w^b̄_alpha, r^ā_alpha, w^c̄_alpha), for
some b̄ and c̄ in KAPPA (b̄ ≠ c̄), then R^a_alpha appears
in L and R^a_alpha satisfies protocol P2 with respect
to {b̄, c̄} in L.


A  system  satisfies  the  protocol  selection  rules  if  for  any 
database  design  D  and  transaction  set  TAU,  all  logs 
defined  over  D  and  TAU  satisfy  the  protocol  selection 
rules. 


4.3  Serialization 


Theorem  SR  If  a  system  is  well-formed,  satisfies  the 
pipelining  rules,  and  satisfies  the  protocol  selection 
rules,  then  all  logs  that  it  can  generate  are  serially 
reproducible. 

The  first  stage  of  the  proof  of  this  theorem  is  to  develop 
an  algorithm,  called  the  serialization  procedure,  that 
attempts  to  serialize  a  given  log.  If  the  procedure  gets 
stuck,  then  certain  conditions  are  shown  to  hold  by  lemma
S  (the  serialization  lemma).

4.3.1  Conflicts


We begin by defining a new log symbol, called a composite
atom, which is an adjacent group of symbols (R's, W's, and
an E) that all have the same superscript (i.e., all in the
same transaction) and include an E. The symbolic notation
for a composite atom is A^a[R^a_alpha-1, ..., R^a_alpha-m,
W^a_alpha-(m+1), ..., W^a_alpha-n], which is equivalent to the
sublog

R^a_alpha-1 ... R^a_alpha-m E^a W^a_alpha-(m+1) ... W^a_alpha-n.

Frequently, we will simply write A^a for the composite
atom, as an abbreviation. Note that not all R^a's and W^a's
that appear in a log must be members of A^a. The only log
symbol that must be a member of A^a is E^a. Also, note that
since for each transaction, a, no more than one E^a occurs
in a log, therefore only one A^a can appear in a log. The
introduction of composite atoms is simply a notational
convenience so that groups of symbols for a single
transaction can be handled as a unit.
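
One way to picture composite atoms is to group a log so that
each E absorbs the R's of its own transaction that sit
immediately before it and the W's of its own transaction that
sit immediately after it. Choosing the largest such group is
an assumption of this sketch (the definition permits smaller
groups), as are the function name and the tuple encoding of
symbols.

```python
def to_atoms(log):
    """Group a log into atoms, each represented as a list of symbols.

    Symbols are ("R", a, alpha), ("E", a) or ("W", a, alpha). A list
    that contains an E is a composite atom; a one-element list holding
    an R or a W is an isolated symbol.
    """
    atoms = []
    for sym in log:
        kind, txn = sym[0], sym[1]
        if kind == "E":
            group = [sym]
            # Absorb isolated R's of the same transaction that are
            # immediately to the left of this E.
            while (atoms and len(atoms[-1]) == 1
                   and atoms[-1][0][0] == "R" and atoms[-1][0][1] == txn):
                group.insert(0, atoms.pop()[0])
            atoms.append(group)
        elif (kind == "W" and atoms and atoms[-1][0][1] == txn
              and any(s[0] == "E" for s in atoms[-1])):
            # A W directly after its own composite atom joins it.
            atoms[-1].append(sym)
        else:
            atoms.append([sym])
    return atoms
```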

We define an atom to be either a composite atom or an
isolated R or W that is not adjacent to its corresponding
E (i.e. is not a member of a composite atom). In the
sequel, we assume that all logs consist of atoms. That
is, all E's are replaced by A's. To do this, the log
transformations, TR1 - TR7, and conflicts, NTR1 - NTR6, must
be extended to handle A's. The extensions are direct
consequences of the original transformations and conflicts
and the definition of atom. Since we only need conflicts
in our proof, we will only extend conflicts and not bother
with the transformations. In the following, note that
composite atoms are never split up. The conflicts are:


NTR1'. ... R^a_alpha R^b_alpha ... where transclass(a) =
transclass(b).

NTR2'. ... W^a_alpha W^b_alpha ... where transclass(a) =
transclass(b).

NTR3'. ... R^a_alpha W^b_alpha ... or ... W^b_alpha R^a_alpha ...,
where either transclass(a) = transclass(b) or the
three-way intersection of
matzn-of-class(transclass(a))(transreadset(a)) and
logical^-1(transwriteset(b)) and
stored-data^-1(alpha) is nonempty.

NTR4'. ... R^a_alpha A^a ...

NTR5'. ... A^a W^a_alpha ...

NTR6'. ... A^a A^b ... where at least one of the
following holds:

i. transclass(a) = transclass(b), or

ii. transwriteset(a) and transwriteset(b) have
a non-empty intersection, or

iii. R^a_alpha is in A^a and W^b_alpha is in A^b and
R^a_alpha W^b_alpha conflict by NTR3', or

iv. W^a_alpha is in A^a and R^b_alpha is in A^b and
W^a_alpha R^b_alpha conflict by NTR3'.

NTR7'. ... R^a_alpha A^b ... or ... A^b R^a_alpha ... where
either

i. W^b_alpha is in A^b and R^a_alpha W^b_alpha
conflict by NTR3', or

ii. R^b_alpha is in A^b and transclass(a) =
transclass(b).

NTR8'. ... W^a_alpha A^b ... or ... A^b W^a_alpha ... where
either

i. R^b_alpha is in A^b and R^b_alpha W^a_alpha
conflict by NTR3', or

ii. W^b_alpha is in A^b and transclass(a) =
transclass(b).


Lemma  C  If  a  pair  of  adjacent  atoms  in  log  L  are  not  in 
conflict,  then  the  log  resulting  from  switching  these 
atoms  is  equivalent  to  L. 


Proof  Follows  directly  from  Theorem  TR  in  Section  3.3. 
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4.3.2  Augmented  Conflicts 

In  the  second  stage  of  the  proof,  we  will  frequently  draw 
contradictions  regarding  possible  orderings  of  atoms  in  a 
log  by  appealing  to  certain  protocols.  However,  after  a 
log  has  been  partially  serialized,  many  of  the  atoms  will 
no  longer  be  in  the  same  order  in  which  they  appeared  in 
the  original  log  before  any  attempt  was  made  at 
serialization.  Therefore,  the  fact  that  a  partially 
serialized  log  violates  the  PSRs  does  not  necessarily 
imply  that  the  given  log  violates  the  PSRs.  That  is,  it 
is  only  the  given  log  which,  by  hypothesis,  must  satisfy 
the  PSRs.  Hence,  we  are  unable  to  draw  the  desired 
contradiction . 

What  we  require  is  a  proof  mechanism  to  guarantee  that 
certain  protocol  violations  in  a  partially  serialized  log 
imply  the  same  violations  in  the  given  log.  To  do  this, 
we  introduce  additional  conflicts  (called  augmented 
conflicts) ,  so  that  while  trying  to  serialize  a  given  log, 
we  do  not  destroy  some  of  the  protocol  information.  These 
additional  conflicts  can  be  reflected  in  additional  edges 
in  the  conflict  graph  (called  augmented  edges) .  We 
proceed  by  defining  these  concepts  formally. 
All  augmented  conflicts  are  between  pairs  of  E's.  Since
we  have  replaced  E's  by  A's  for  the  purposes  of  the  proof,
we  will  state  the  augmented  conflict  rules  in  terms  of
A's.

ANTRp2:  . ..AaAb...  if  there  is  a  transaction  c  in 
TAU  (c  i  a,  c  i  b)  and  an  alpha  in  DELTA  such  that 
Ralpha  must  (according  to  the  PSRs)  satisfy  P2  with 
respect  to  transclass ( a)  and  transclass( b)  at 
alpha . 


3  b 

ANTRp^j.:  ...A  A  ...  if  there  is  a  transaction  c  in 

TAU  and  alpha  and  beta  in  DELTA  such  that  Rc.  . 

alpha 

and  R^eta  must  (according  to  the  PSRs)  satisfy 
protocol  P2f  with  respect  to  transclass( a)  and 
transclass( b)  respectively. 


ANTR_P3: ...A^a A^b... if there is an alpha in DELTA
such that either R^a_alpha must (according to the
PSRs) run P3 with respect to b̄ or R^b_alpha must
(according to the PSRs) run P3 with respect to ā.


Two atoms are in augmented conflict if they conflict by
NTR1' - NTR8' or by ANTR_P2, ANTR_P2f, or ANTR_P3.
Corollary C-AUG If a pair of adjacent atoms are not in
augmented conflict in log L, the log resulting from
switching these atoms is equivalent to L.


Proof Follows immediately from lemma C. Q.E.D.


Each of the above conflicts must generate an edge in the
conflict graph. We define an augmented conflict graph,
ACG(D) = <VERTICES, EDGES ∪ EDGES_aug>, as a vertex labelled
undirected graph defined over database design D where
VERTICES and EDGES are identical to those in CG(D) and
EDGES_aug is:

EDGES_aug = EDGES_aug-P3 ∪ EDGES_aug-P2f ∪ EDGES_aug-P2

EDGES_aug-P3 = {(e^a, e^b): for all classes ā, b̄ in KAPPA
such that there exists an alpha in DELTA such that
Le» 

.5 


.a 

alpha 


r“,  _  lies  on  a  nonredundant  cycle  in  CG(D)  in  a 


'beta^  or  ^walpha 


subpath  Cw“lpSa,  r3lpha,  ea,  w? 

ra.  .  ,ea,  ec)  for  some  beta  in  DELTA  and  c  in 
aipna ’ 

KAPPA. } 


EDGES_aug-P2f = {(e^a, e^b): for all classes ā, b̄ in KAPPA
such that there exists a c̄ in KAPPA and an alpha
and beta in DELTA such that r^c_alpha and r^c_beta lie
on a nonredundant cycle in CG(D) in a subpath
(w^b_beta, r^c_beta, e^c, r^c_alpha, w^a_alpha).}

EDGES_aug-P2 = {(e^a, e^b): for all classes ā, b̄ in KAPPA
such that there exists a c̄ in KAPPA and an alpha
in DELTA such that r^c_alpha lies on a nonredundant
cycle in CG(D) in a subpath (w^a_alpha, r^c_alpha,
w^b_alpha).}


We  reiterate  that  the  augmented  conflicts  are  required 
only  to  retain  certain  ordering  information  between  E's  in 
a  log,  so  that  this  information  is  not  destroyed  while 
trying  to  serialize  a  log. 
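
As a rough illustration of the definition of ACG(D), the
following Python sketch forms the augmented edge set as the
union of the three augmented edge sets and adds it to the
ordinary edges of CG(D). The function names, the string
vertex labels, and the frozenset edge representation are
assumptions made for this sketch only; they are not part of
the SDD-1 design.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]  # an undirected edge between two vertex labels


def edge(u: str, v: str) -> Edge:
    """Build an undirected edge; (u, v) and (v, u) compare equal."""
    return frozenset((u, v))


def augmented_conflict_graph(
    vertices: Iterable[str],
    edges: Iterable[Edge],
    aug_p3: Iterable[Edge],
    aug_p2f: Iterable[Edge],
    aug_p2: Iterable[Edge],
) -> Tuple[Set[str], Set[Edge]]:
    """Return (VERTICES, EDGES ∪ EDGES_aug) for a given CG(D)."""
    # EDGES_aug is the union of the three augmented edge sets.
    edges_aug = set(aug_p3) | set(aug_p2f) | set(aug_p2)
    # Vertices are unchanged; only the edge set grows.
    return set(vertices), set(edges) | edges_aug
```

Because edges are stored as unordered pairs, an augmented
edge that duplicates an ordinary edge is absorbed by the
union rather than counted twice.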


4.3.3  The  Serialization  Procedure 


The serialization procedure takes a non-serial log and
tries to serialize it by switching adjacent atoms that are
not in augmented conflict. The actual serialization is
done by the procedures MOVELEFT and MOVERIGHT, which scan
the sublog that separates the two atoms to be serialized
and try to remove atoms from that sublog, thereby
bringing the two atoms closer together. The procedure SP
repeatedly calls MOVELEFT and MOVERIGHT until the two
atoms have been serialized or until the two atoms cannot
be brought closer together. The choice of which atoms to
serialize is made by SERIALIZE, which quits if either the
given log has been completely serialized or there are two
atoms which cannot be serialized.


SERIALIZE: PROCEDURE (L_in, L_out, LEFTATOM, RIGHTATOM)
           RETURNS (BOOLEAN);

   /* The procedure takes L_in as input. If L_in is
   successfully serialized, it returns TRUE. If not, it
   returns FALSE, and the log L_out is the partially
   serialized log where LEFTATOM and RIGHTATOM are the pair
   of atoms that could not be serialized. */

   L_out := L_in;

   DO FOREVER;

      Select the leftmost atom in L_out that is either

         i.  an atom A^a and there is an alpha with
             R^a_alpha in L_out but R^a_alpha not in A^a; or

         ii. an atom W^a_alpha in L_out but W^a_alpha is
             not in A^a;

      IF there is no (i) or (ii) THEN RETURN (TRUE);

      IF (i) is the case satisfied above
         THEN BEGIN LEFTATOM := rightmost R^a in L_out but
              not in A^a; RIGHTATOM := A^a; END;
         ELSE BEGIN LEFTATOM := A^a; RIGHTATOM := W^a_alpha;
              END;

      IF NOT SP(L_out, LEFTATOM, RIGHTATOM)
         THEN RETURN (FALSE);
         ELSE MERGE LEFTATOM and RIGHTATOM into
              a single A^a;

   END

END SERIALIZE;


SP: PROCEDURE (LOG, LEFTATOM, RIGHTATOM) RETURNS (BOOLEAN);

   TEMP1 := TEMP2 := TRUE;

   DO WHILE ((TEMP1 or TEMP2) and (LEFTATOM is not
   adjacent to RIGHTATOM));

      TEMP1 := MOVELEFT (LOG, LEFTATOM, RIGHTATOM);

      TEMP2 := MOVERIGHT (LOG, LEFTATOM, RIGHTATOM);

   END;

   IF (LEFTATOM is adjacent to RIGHTATOM)
      THEN RETURN (TRUE);
      ELSE RETURN (FALSE);

END SP;

MOVELEFT: PROCEDURE (LOG, LEFTATOM, RIGHTATOM) RETURNS
          (BOOLEAN);

   TEMPLOG := LOG; TEMPOUT := FALSE;

   DO FOR EACH atom, X, between LEFTATOM and RIGHTATOM
   in LOG beginning with the right neighbor of LEFTATOM
   and moving right;

      DO WHILE ((left neighbor of X in TEMPLOG is not in
      augmented conflict with X) AND (right neighbor of
      X is not LEFTATOM));

         Switch X with its left neighbor in TEMPLOG;

      END;

      IF (right neighbor of X in TEMPLOG is LEFTATOM)
         THEN TEMPOUT := TRUE;

   END;

   LOG := TEMPLOG;

   RETURN (TEMPOUT);

END MOVELEFT;

MOVERIGHT: PROCEDURE (LOG, LEFTATOM, RIGHTATOM) RETURNS
           (BOOLEAN);

   TEMPLOG := LOG; TEMPOUT := FALSE;

   DO FOR EACH atom, X, between RIGHTATOM and LEFTATOM
   in LOG beginning with the left neighbor of RIGHTATOM
   and moving left;

      DO WHILE ((right neighbor of X in TEMPLOG is not in
      augmented conflict with X) AND (left neighbor of X
      is not RIGHTATOM));

         Switch X with its right neighbor in TEMPLOG;

      END;

      IF (left neighbor of X in TEMPLOG is RIGHTATOM)
         THEN TEMPOUT := TRUE;

   END;

   LOG := TEMPLOG;

   RETURN (TEMPOUT);

END MOVERIGHT;
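
The MOVELEFT step can be sketched in Python roughly as
follows. This is only an illustrative reading of the
pseudocode, under several assumptions: atoms are unique
strings in a list, augmented conflict is supplied by the
caller as a symmetric predicate `in_conflict` (a hypothetical
name), and the function returns a new log together with the
flag instead of assigning to LOG in place.

```python
from typing import Callable, List, Tuple


def move_left(
    log: List[str],
    left_atom: str,
    right_atom: str,
    in_conflict: Callable[[str, str], bool],
) -> Tuple[List[str], bool]:
    """Slide atoms lying between left_atom and right_atom leftwards.

    Returns the transformed log and True if at least one atom was
    moved out of the sublog, i.e. ended up just left of left_atom.
    """
    temp = list(log)
    moved_out = False
    i, j = log.index(left_atom), log.index(right_atom)
    # Atoms strictly between the two, scanned from left to right.
    for x in log[i + 1 : j]:
        pos = temp.index(x)
        # Keep switching x with its left neighbor while that neighbor
        # does not conflict with x and x has not yet passed left_atom.
        while (
            pos > 0
            and not in_conflict(temp[pos - 1], x)
            and not (pos + 1 < len(temp) and temp[pos + 1] == left_atom)
        ):
            temp[pos - 1], temp[pos] = temp[pos], temp[pos - 1]
            pos -= 1
        # x has left the sublog if left_atom is now its right neighbor.
        if pos + 1 < len(temp) and temp[pos + 1] == left_atom:
            moved_out = True
    return temp, moved_out
```

MOVERIGHT would be the mirror image, scanning from the left
neighbor of RIGHTATOM and switching each atom with its right
neighbor.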


SDD-1  Concurrency  Control  Mechanism  Page  -119 

Proof  of  Serial  Reproducibility  Section  4 


4.3.4  The  Serialization  Lemma 

If  the  serialization  procedure,  SERIALIZE,  is  given  a  log 
that  is  not  serially  reproducible,  then  certain  properties 
must  be  true  of  the  output  of  SERIALIZE.  These  properties 
are  summarized  in  lemma  S  presented  in  this  section. 

First, we require two new definitions. A log, L1, is a
projection of a log, L2, if L1 can be obtained from L2
simply by excising atoms from L2. A log, L, is blocked if
every atom in L is in augmented conflict with both its
left and right neighbors in L. Our goal in lemma S will
be to construct a blocked projection of the log that
SERIALIZE outputs.
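
The two definitions can be stated as small predicates. The
Python sketch below assumes that a log is a list of atom
labels and that augmented conflict is given as a
caller-supplied predicate `in_conflict`; both names and the
representation are assumptions of this sketch. An atom at
either end of the log has only one neighbor, so the check
reduces to every adjacent pair being in conflict.

```python
from typing import Callable, List


def is_projection(l1: List[str], l2: List[str]) -> bool:
    """True if l1 can be obtained from l2 by excising atoms only.

    Excision keeps the surviving atoms in their original order, so
    this is a subsequence test.
    """
    remaining = iter(l2)
    # `atom in remaining` advances the iterator past the match, which
    # forces the atoms of l1 to be found in l2 in the same order.
    return all(atom in remaining for atom in l1)


def is_blocked(log: List[str], in_conflict: Callable[[str, str], bool]) -> bool:
    """True if every pair of adjacent atoms is in augmented conflict."""
    return all(in_conflict(x, y) for x, y in zip(log, log[1:]))
```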

Lemma S Let LOG_given be a well-formed log defined on the
database design D. If LOG_given is not serially
reproducible, then

I.   SERIALIZE (LOG_given, LOG_out, LA, RA) returns
     FALSE;

II.  every atom of the form W^a_alpha to the left of RA
     in LOG_out is a member of A^a;

III. every atom of the form A^a to the left of RA in
     LOG_out has no R^a's in LOG_out that are not members
     of A^a;

IV.  there is a blocked projection, LOG_blocked, of
     LOG_out such that

     i.  LA and RA are the leftmost and rightmost
         atoms of LOG_blocked respectively;

     ii. there is an a in TAU and an alpha in DELTA
         such that either (LA = R^a_alpha and RA = A^a)
         or (LA = A^a and RA = W^a_alpha).


Proof (Part I) Since only equivalence preserving
transformations are attempted by SERIALIZE (by corollary
C-AUG), if LOG_given is not serially reproducible then
SERIALIZE must fail to serialize it and therefore returns
FALSE.


(Parts II and III) The last atom selected by SERIALIZE was
the leftmost atom that was either a W not in any A or an A
with an outstanding R. Hence, there can be no atoms to
the left of RA in LOG_out with either of these properties.

(Part IV) Construct LOG_blocked from LOG_out as follows:
Excise all atoms to the left of LA and to the right of RA
in LOG_out. Let X be LA's right neighbor. Let Y be an
atom in the log somewhere to the right of X that conflicts
with X. There must be such a Y, for otherwise MOVERIGHT
would have moved X to the right of RA. Excise all atoms
in LOG_out between X and Y. If Y ≠ RA, then set X := Y and
find a Y to the right of X that conflicts with X as
before. Repeat this process until Y = RA. The resulting
log, LOG_blocked, is a projection of LOG_out and is blocked
(by construction). Furthermore, by the choice of LA or RA
in SERIALIZE, IV (ii) must hold. Q.E.D.
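
The construction in Part IV can be pictured with the
following Python sketch. It is an illustration only, under
assumptions not in the text: atoms are unique strings, the
conflict relation is a caller-supplied predicate
`in_conflict`, Y is taken to be the nearest conflicting atom
to the right of X (the proof allows any), and the function
returns None when no such Y exists, a situation the proof
rules out for logs produced by SERIALIZE.

```python
from typing import Callable, List, Optional


def blocked_projection(
    log_out: List[str],
    la: str,
    ra: str,
    in_conflict: Callable[[str, str], bool],
) -> Optional[List[str]]:
    """Follow a chain of conflicts from LA's right neighbor to RA."""
    i, j = log_out.index(la), log_out.index(ra)
    # Excise everything to the left of LA and to the right of RA.
    segment = log_out[i : j + 1]
    if len(segment) < 3:
        return segment  # LA and RA are adjacent (or identical)
    result = [segment[0]]
    pos = 1  # X starts as LA's right neighbor
    while True:
        x = segment[pos]
        result.append(x)
        if x == ra:
            return result
        # Find Y: the nearest atom to the right of X conflicting with X.
        nxt = next(
            (k for k in range(pos + 1, len(segment)) if in_conflict(x, segment[k])),
            None,
        )
        if nxt is None:
            return None  # no conflicting Y; excluded by the proof
        pos = nxt  # atoms between X and Y are excised; X := Y
```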

While  lemma  S  shows  that  every  non-serially  reproducible 
log  will  fail  to  be  serialized  by  SERIALIZE,  it  does  not 
claim  that  if  a  log  is  serially  reproducible  then 
SERIALIZE  will  succeed.  This  converse  is  not  in  general 
true,  for  the  transformations  we  use  are  not  complete,  as 
mentioned  in  Section  3.3.  If  we  were  able  to  find  a  more 
complete  set  of  transformations ,  then  this  might  permit  us 
to  weaken  our  protocols;  for  some  of  the  serially 
reproducible  logs  that  are  not  serializable  under  our 
current  transformations  may  no  longer  require  a  strong 
protocol  to  guarantee  that  they  will  not  occur. 
Showing Nonserializable Logs are Impossible


The  proof  of  theorem  SR  is  embodied  in  two  major  lemmas. 
We  first  present  the  structure  of  the  proof  and  then 
proceed  to  the  lemmas. 


Theorem SR If a system is well-formed, satisfies the
pipelining rules, and satisfies the protocol selection
rules, then all logs that it can generate are serially
reproducible.


Proof Assume the theorem is false. Then there is a log,
say LOG_given, which is well-formed, satisfies the
pipelining rules, and satisfies the protocol selection
rules, but is not serially reproducible. By lemma S,
SERIALIZE (LOG_given, LOG_out, LA, RA) returns false and, by
IV(ii), there is a transaction a in TAU and an alpha in
DELTA where either (LA = R^a_alpha and RA = A^a) or (LA = A^a
and RA = W^a_alpha). These possibilities are shown below to
be impossible by lemmas RA and AW respectively. Hence,
the conclusions of lemma S were false. But this is
possible only if the hypothesis of lemma S is false. So,
the hypothesis that LOG_given was not serially reproducible
must be false. Q.E.D.

To prove lemmas RA and AW we will use the following
lemmas.


Lemma P Let L be a blocked log over transaction set TAU
and database design D such that the leftmost atom is in
class ā, the rightmost atom is in class b̄, and the log has
no atom in class c̄. Then there is a path in CG(D) which
is not incident with any node in class c̄.


Proof If ā = b̄, then the lemma is trivially true. If ā ≠
b̄, then since the log is blocked, each atom is in
augmented conflict with its neighbors. Each such conflict
corresponds to an edge in ACG(D), so there is a path from
ā to b̄ in ACG(D) that is not incident with c̄. To find a
new path in CG(D) we need to replace each edge in the old
path that is in EDGES_aug by a path in CG(D). Consider
some edge, say (e^d, e^f), in the path in EDGES_aug. If the
edge is in EDGES_aug-P3, then replace it by the path (e^d,
w^d_alpha, r^f_alpha, e^f) that must exist by definition of
EDGES_aug-P3. If the edge is in EDGES_aug-P2f, then there
is a class, ḡ, and data modules alpha and beta such that
the subpath (e^d, w^d_alpha, r^g_alpha, e^g, r^g_beta,
w^f_beta, e^f) is in CG(D) and there is a path in CG(D)
from d̄ to f̄ that is not incident with ḡ. If ḡ ≠ c̄, then
replace (e^d, e^f) by the subpath (which is not incident
with c̄). If ḡ = c̄, then replace (e^d, e^f) by the other
d̄ - f̄ path. If the edge is in EDGES_aug-P2, then there is
a class, ḡ, and a datamodule, alpha, such that there is a
subpath (e^d, w^d_alpha, r^g_alpha, w^f_alpha, e^f) in CG(D)
and there is a path in CG(D) from d̄ to f̄ that is not
incident with ḡ. If ḡ ≠ c̄, then replace (e^d, e^f) by the
subpath; else replace it by the other d̄ - f̄ path. If all
edges in EDGES_aug are replaced in this way by paths in
CG(D) that are not incident with class c̄, then we have
constructed a path in CG(D) with no node in class c̄. To
make the path nonredundant, simply replace each nontrivial
subpath whose endpoints are in the same class by vertical
edges. This nonredundant path then satisfies the lemma.
Q.E.D.


Lemma B Let L be a log defined over transaction set TAU
and database design D. Let L_out be a log obtained from L
by the serialization procedure, and let L'_out be a
projection of L_out. Let X^a and Y^b be symbols (i.e., not
atoms) that are in augmented conflict such that X^a
precedes Y^b in L'_out. Then X^a precedes Y^b in L.


Proof Since the serialization procedure never switches
atoms that are in augmented conflict, X^a and Y^b must have
appeared in the same order in L and L_out. The same must
hold in L'_out, since the latter is a projection of L_out.
Q.E.D.
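
The invariant behind lemma B can be checked mechanically on
small examples. The Python sketch below is an illustration
under stated assumptions: atoms are unique strings,
augmented conflict is a caller-supplied symmetric predicate
`in_conflict`, and both function names are hypothetical. A
switch is performed only when the two adjacent atoms do not
conflict, and the second function tests whether every
conflicting pair in a derived log (the serialized log or any
projection of it) keeps its original relative order.

```python
from typing import Callable, List


def switch_if_allowed(
    log: List[str], i: int, in_conflict: Callable[[str, str], bool]
) -> List[str]:
    """Switch log[i] and log[i+1] unless they are in conflict."""
    out = list(log)
    if not in_conflict(out[i], out[i + 1]):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def conflicting_order_preserved(
    original: List[str],
    derived: List[str],
    in_conflict: Callable[[str, str], bool],
) -> bool:
    """True if each conflicting pair in `derived` is ordered as in `original`."""
    pos = {atom: k for k, atom in enumerate(original)}
    for k, x in enumerate(derived):
        for y in derived[k + 1 :]:
            # x precedes y in derived; it must also precede y originally.
            if in_conflict(x, y) and pos[x] > pos[y]:
                return False
    return True
```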
Lemma RA Let LOG_given be a log defined over database
design D and transaction set TAU such that it is
well-formed, satisfies the pipelining rules, and satisfies
the protocol selection rules. Then it is not possible
that SERIALIZE (LOG_given, LOG_out, LA, RA) returns FALSE
with LA = R^a_alpha and RA = A^a for some a in TAU and alpha
in DELTA.


Proof Assume that the lemma is false. Then, SERIALIZE
returns FALSE and, by lemma S, there is a blocked
projection of LOG_out, say LOG_blocked, whose leftmost and
rightmost atoms are R^a_alpha and A^a respectively. That is,
LOG_blocked is of the form R^a_alpha ... A^a.

Beginning with A^a, scan left in LOG_blocked until the first
occurrence is found of an R^a'_beta where R^a'_beta is not in
its A and transclass(a') = transclass(a). (Note: possibly
alpha = beta, and possibly R^a'_beta = R^a_alpha.) Now,
starting with R^a'_beta, scan right in LOG_blocked until the
first A^a'' is found with transclass(a'') = transclass(a).

We want to show that A^a'' is actually A^a. So suppose not,
i.e., a'' ≠ a. Since R^a'_beta is not in A^a' (by
construction), by lemma S part III, E^a must precede E^a' in
LOG_out. Since E^a and E^a' conflict, by lemma B E^a preceded
E^a' in LOG_given (or E^a = E^a'). Since transclass(a'') =
transclass(a), and since E^a'' and E^a conflict in
LOG_blocked, by lemma B E^a'' preceded E^a in LOG_given. By
lemma S part III, A^a'' must contain all of its R's,
including R^a''_beta. Since R^a'_beta precedes R^a''_beta in
LOG_blocked and R^a''_beta conflicts with R^a'_beta, by lemma
B R^a'_beta preceded R^a''_beta in LOG_given. But since E^a''
preceded E^a', this is a violation of R-R pipelining. So, we
have a'' = a, as desired. That is, LOG_blocked is of the
form R^a_alpha ... R^a'_beta ... A^a ... A^a', such that no
A^a'' with transclass(a'') = transclass(a) appears between
R^a'_beta and A^a.

We create a new log, LOG'_blocked, by excising from
LOG_blocked those atoms to the left of R^a'_beta and those
between A^a and A^a'. Since A^a - A^a' conflict, LOG'_blocked
is indeed blocked. So, LOG'_blocked is of the form R^a'_beta
... A^a A^a' (where possibly a' = a).


Consider R^a'_beta. There are only two kinds of atoms that
can be R^a'_beta's conflicting right neighbor: either an
R^a''_beta where transclass(a'') = transclass(a'); or a
W^b_beta where transclass(a') ≠ transclass(b) and the
three-way intersection of
matzn-of-class(transclass(a'))(transreadset(a')) and
logical^-1(transwriteset(b)) and stored-data^-1(beta) is
nonempty. By choice of R^a'_beta, R^a''_beta is not possible.
So, R^a'_beta's right neighbor must be W^b_beta. (Note:
possibly alpha = beta). By lemma S (part II), W^b_beta is a
member of A^b.

Let X^c be the left neighbor of A^a in LOG'_blocked. That is,
LOG'_blocked is of the form
R^a'_beta A^b[...W^b_beta...] ... X^c A^a A^a'.

Since LOG'_blocked is blocked, X^c and A^a are in augmented
conflict. (Note: possibly c = b, A^b = X^c). By the above
argument regarding A^a'', transclass(a) ≠ transclass(c).


In the remainder of this proof, let ā = transclass(a), b̄ =
transclass(b), and c̄ = transclass(c).

Claim RA-path There is a nonredundant path in CG(D) from
a node labelled b̄ to a node labelled c̄ such that the path
passes through no other node labelled ā.


This  claim,  which  follows  directly  from  lemma  P,  will  be 
applied  repeatedly  in  the  remainder  of  the  proof. 


In the remainder of the proof, we analyze the ways in
which X^c can be in augmented conflict with A^a, and show
each possible conflict to be impossible. Since the only
assumption made so far is that the lemma is false, the
contradiction that X^c does not conflict with A^a will prove
the lemma.

X^c can only be in augmented conflict with A^a due to one
of NTR1' - NTR8', ANTR_P2, ANTR_P2f, or ANTR_P3. Since c̄ ≠ ā
(by construction), NTR1', NTR2', NTR3', NTR4', NTR6'(i),
and NTR7'(ii) cannot be the cause of the conflict. NTR5'
trivially does not apply, since it does not apply to an A.
NTR8' cannot apply because by lemma S, X^c cannot be a W^c
that is not a member of A^c. The remaining cases are
NTR6'(ii), NTR6'(iii), NTR6'(iv), NTR7'(i), ANTR_P2,
ANTR_P2f, and ANTR_P3; they are subsumed by the following
cases:


I.   X^c = A^c; there is a gamma in DELTA with W^c_gamma
     in A^c and R^a_gamma in A^a; and the three-way
     intersection of matzn-of-class(ā)(transreadset(a))
     and logical^-1(transwriteset(c)) and
     stored-data^-1(gamma) is nonempty.


II.  there is a gamma in DELTA such that either X^c =
     R^c_gamma or (X^c = A^c and R^c_gamma is in A^c);
     W^a_gamma is in A^a; and the three-way intersection of
     matzn-of-class(c̄)(transreadset(c)) and
     logical^-1(transwriteset(a)) and stored-data^-1(gamma)
     is nonempty.


III. X^c = A^c and the intersection of transwriteset(c)
     and transwriteset(a) is nonempty.


IV.  X^c = A^c and A^c - A^a are in augmented conflict by
     ANTR_P2, ANTR_P2f, or ANTR_P3.


We  analyze  each  of  the  four  cases  in  detail. 
Case I (X^c = A^c and contains W^c_gamma; A^a contains
R^a_gamma; W^c_gamma and R^a_gamma conflict)

LOG'_blocked is of the form
R^a'_beta A^b[...W^b_beta...] ... A^c[...W^c_gamma...]
A^a[...R^a_gamma...] A^a'.

There are two subcases to consider: beta ≠ gamma and beta
= gamma.


Subcase beta ≠ gamma

From the a' - b and c - a conflict in LOG'_blocked, the
edges (r^a_beta, w^b_beta) and (w^c_gamma, r^a_gamma) are in
CG(D).

By definition of EDGES_vert, the edges (r^a_beta, e^a) and
(r^a_gamma, e^a) are in CG(D).

By claim RA-path, there is a nonredundant path in
CG(D) from w^b_beta to w^c_gamma that does not pass through
any nodes in class ā. Graphically, we have the cycle
noted in figure 4.1.

This cycle and the protocol selection rules imply
R^a'_gamma and R^a'_beta must satisfy protocol P2f against c̄
and b̄ (respectively) at gamma and beta
(respectively). The following sequence of inferences
leads to a contradiction.

i.   Since E^a and E^a' are in augmented conflict and
     E^a precedes E^a' in LOG'_blocked, by lemma B E^a
     precedes E^a' in LOG_given. By R-R pipelining,
     R^a_gamma precedes R^a'_gamma in LOG_given.

ii.  W^c_gamma conflicts with R^a_gamma, so by lemma B
     R^a_gamma followed W^c_gamma in LOG_given.

iii. By (i), (ii) and transitivity, W^c_gamma
     precedes R^a'_gamma in LOG_given.

iv.  By definition of ANTR_P2f, E^b and E^c are in
     augmented conflict. So by lemma B, E^b precedes E^c
     in LOG_given.

v.   Since R^a'_beta and W^b_beta conflict and R^a'_beta
     precedes W^b_beta in LOG'_blocked, by lemma B
     R^a'_beta precedes W^b_beta in LOG_given.

vi.  But (iii), (iv), and (v) constitute a
     violation of the partitioned writes property for
     R^a'_beta and R^a'_gamma with respect to b̄ and c̄
     respectively. So, a' violated protocol P2f, a
     contradiction. This proves case I, subcase beta ≠
     gamma.


Subcase beta = gamma

In this case, LOG'_blocked is of the form:

R^a'_beta ... A^c[...W^c_beta...] A^a[...R^a_beta...] A^a'

If a = a' then R^a'_beta isn't unique in the log, a
contradiction. If a ≠ a', then since R^a'_beta and R^a_beta
conflict, by lemma B R^a'_beta precedes R^a_beta in
LOG_given. Since E^a and E^a' conflict, by lemma B E^a
precedes E^a' in LOG_given. This is a violation of R-R
pipelining. Contradiction!

Case II (either R^c_gamma = X^c or R^c_gamma is in A^c = X^c;
W^a_gamma is in A^a and R^c_gamma and W^a_gamma conflict)

LOG'_blocked is either of the form

R^a'_beta A^b[...W^b_beta...] ... R^c_gamma
A^a[...W^a_gamma...] A^a'

or

R^a'_beta A^b[...W^b_beta...] ... A^c[...R^c_gamma...]
A^a[...W^a_gamma...] A^a'.

From the conflicts in LOG'_blocked, the edges (r^a_beta,
w^b_beta) and (r^c_gamma, w^a_gamma) are in CG(D).

By definition of EDGES_vert, the edges (r^a_beta, e^a) and
(e^a, w^a_gamma) are in CG(D).

By claim RA-path, there is a nonredundant path from w^b_beta
to r^c_gamma that does not pass through any node in class ā.
So we have the nonredundant cycle shown in figure 4.2.

This graph and the protocol selection rules imply that
R^a'_beta must satisfy protocol P3 with respect to b̄ at beta.
By ANTR_P3, E^b and E^a are in augmented conflict. Since E^b
precedes E^a in LOG'_blocked, by lemma B E^b precedes E^a in
LOG_given. By the same argument, we deduce that E^a
precedes E^a' in LOG_given. By transitivity, E^b precedes
E^a' in LOG_given. Since R^a'_beta and W^b_beta conflict, by
lemma B R^a'_beta precedes W^b_beta in LOG_given. This is a
violation of P3, a contradiction, thereby proving case II.


Case III (X^c = A^c and the intersection of
transwriteset(c) and transwriteset(a) is nonempty.)

LOG'_blocked is of the form
R^a'_beta A^b[...W^b_beta...] ... A^c A^a A^a'.

(This argument is essentially the same as Case II.)

From the conflicts in LOG'_blocked, the edges (r^a_beta,
w^b_beta) and (e^a, e^c) are in CG(D).

By definition of EDGES_vert, the edge (r^a_beta, e^a) is in
CG(D).

By claim RA-path, there is a nonredundant path from w^b_beta
to e^c that does not pass through any node in class ā. So,
we have the nonredundant cycle shown in figure 4.3.

This graph and the protocol selection rules imply R^a'_beta
must satisfy P3 with respect to c̄ at beta. By ANTR_P3, E^b
and E^a are in augmented conflict. Since E^b precedes E^a in
LOG'_blocked, by lemma B E^b precedes E^a in LOG_given. By
the same argument, we deduce that E^a precedes E^a' in
LOG_given. By transitivity, E^b precedes E^a' in LOG_given.
Since R^a'_beta and W^b_beta conflict, by lemma B, R^a'_beta
precedes W^b_beta in LOG_given. This violates P3, a
contradiction, thereby proving case III.

[Figure 4.3. Nonredundant cycle for Case III of Lemma RA]


Case IV (X^c = A^c and A^c - A^a are in augmented conflict by
ANTR_P2, ANTR_P2f, or ANTR_P3.)

LOG'_blocked is of the form
R^a'_beta A^b[...W^b_beta...] ... A^c A^a A^a'.

From the log, the edge (r^a_beta, w^b_beta) is in CG(D). From
claim RA-path, there is a nonredundant path in CG(D) from
a node in c̄ to a node in b̄ that does not pass through any
node in ā. There are now three subcases to consider, one
for each of ANTR_P2, ANTR_P2f and ANTR_P3, the only ways that
A^c - A^a can be in augmented conflict.

Subcase IV - ANTR_P2

By ANTR_P2, there is a class, d̄, and a data module, gamma,
such that there is a nonredundant cycle in CG(D) with the
subpath (w^c_gamma, r^d_gamma, w^a_gamma).

So, we can deduce that the edges (w^c_gamma, r^d_gamma) and
(r^d_gamma, w^a_gamma) are in CG(D). By definition of
EDGES_vert, the edges (r^a_beta, e^a) and (e^a, w^a_gamma) are
in CG(D). Hence, given (r^a_beta, w^b_beta) above, we have
the nonredundant path in CG(D) of
(w^b_beta, r^a_beta, e^a, w^a_gamma).

To complete a nonredundant cycle, we need an independent
nonredundant path from w^a_gamma to w^b_beta. If d̄ = b̄, then
we are done since the edges (r^d_gamma, w^a_gamma),
(r^d_gamma, e^b), (e^b, w^b_beta) suffice. If d̄ ≠ b̄, then the
edges (w^a_gamma, r^d_gamma) and (r^d_gamma, w^c_gamma)
together with the known nonredundant path from c̄ to b̄
suffice. (If the path intersects d̄, then the
(r^d_gamma, w^c_gamma) edge can be removed and replaced by
vertical edge(s) connecting r^d_gamma to the c̄ - b̄ path.)
So, we have a nonredundant cycle (see figure 4.4).

By the protocol selection rules, R^a'_beta must satisfy P3
with respect to b̄ at beta. However, P3 is violated in the
following way. By ANTR_P3 and the cycle, E^a and E^b are in
augmented conflict. So, since E^b precedes E^a in
LOG'_blocked, by lemma B, E^b precedes E^a in LOG_given. As
deduced earlier, E^a precedes E^a' in LOG_given. By
transitivity, E^b precedes E^a' in LOG_given. Since R^a'_beta
conflicts with W^b_beta, by lemma B R^a'_beta precedes
W^b_beta in LOG_given. This violates P3, a contradiction,
thereby proving the subcase.


Subcase IV - ANTR_P2f

By ANTR_P2f, there is a class, d̄, and two distinct data
modules, gamma and delta, such that there is a
nonredundant cycle in CG(D) with the subpath
(w^c_gamma, r^d_gamma, e^d, r^d_delta, w^a_delta).

We can now continue exactly as in subcase IV - ANTR_P2,
yielding the same P3 violation (see figure 4.5).


[Figure 4.4. Nonredundant Cycles for Case IV - ANTR_P2 of
Lemma RA. Subcase d̄ ≠ b̄: if the b̄ - c̄ path intersects d̄,
then we have vertical edge(s) from r^d_gamma to the path,
thereby completing the cycle in a slightly different way.]

[Figure 4.5. Nonredundant Cycle for Case IV - ANTR_P2f]


Subcase IV - ANTR_P3

By ANTR_P3, there is a data module, gamma, such that there
is a nonredundant cycle in CG(D) with either the subpath
(e^c, r^c_gamma, w^a_gamma, e^a) or
(e^a, r^a_gamma, w^c_gamma, e^c).

We treat each subpath as a separate case.


Subcase IV - ANTR_P3 - (e^c, r^c_gamma, w^a_gamma, e^a)

We can deduce that the edge (r^c_gamma, w^a_gamma) is in
CG(D). We can now continue exactly as in Subcase IV -
ANTR_P2, yielding the same P3 violation. (See figure 4.6.)


[Figure 4.6. Nonredundant Cycle for Case IV - ANTR_P3 -
(e^c, r^c_gamma, w^a_gamma, e^a) of Lemma RA]


Subcase IV - ANTR_P3 - (e^a, r^a_gamma, w^c_gamma, e^c)

Since (e^a, e^c) is in EDGES_aug-P3 there is a nonredundant
path from e^a to e^c that does not pass through any r^a node
(including r^a_beta). From claim RA-path, there is a
nonredundant path from c̄ to b̄ that does not pass through
any ā node. By concatenating the ā - c̄ and c̄ - b̄ paths
and eliminating any redundant subpaths, we obtain a
nonredundant path from e^a to e^b containing no r^a node.
This path does not pass through the edge (r^a_beta,
w^b_beta), which therefore completes a nonredundant cycle
containing (r^a_beta, w^b_beta) (see figure 4.7). So, by the
protocol selection rules, R^a'_beta must satisfy P3 with
respect to b̄ at beta. We now continue as in subcase IV -
ANTR_P2, yielding the same P3 violation. This completes
case IV, and the proof of lemma RA. Q.E.D.


[Figure 4.7. Nonredundant Cycle for Case IV - ANTR_P3 -
(e^a, r^a_gamma, w^c_gamma, e^c) in Lemma RA]


Lemma AW Let LOG_given be a log defined over database
design DELTA and transaction set TAU such that it is
well-formed, satisfies the pipelining rules, and satisfies
the protocol selection rules. Then it is not possible
that SERIALIZE (LOG_given, LOG_out, LA, RA) returns FALSE
with LA = A^a and RA = W^a_alpha for some a in TAU and alpha
in DELTA.


Proof  Assume that the lemma is false. Then, SERIALIZE
returns FALSE and, by lemma S, there is a blocked
projection of LOGout, say LOGblocked, whose leftmost and
rightmost atoms are A^a and W^a_alpha respectively. That is,
LOGblocked is of the form A^a ... W^a_alpha.

Consider W^a_alpha. There are only two kinds of atoms that
can be W^a_alpha's conflicting left neighbor: either W^a'_alpha
(which by lemma S part II must be contained in A^a') where
transclass(a') = transclass(a), or R^b_alpha (which may or
may not be contained in A^b) such that the three-way
intersection of

matzn-of-class(transclass(b))(transreadset(b)) and
logical^-1(transwriteset(a)) and stored-data^-1(alpha)

is nonempty.

Suppose W^a'_alpha is the conflicting neighbor. Since W^a'_alpha
precedes and conflicts with W^a_alpha, by lemma B W^a'_alpha
precedes W^a_alpha in LOGgiven. Since E^a precedes and
conflicts with E^a' in LOGblocked, by lemma B E^a precedes
E^a' in LOGgiven. But since transclass(a) =
transclass(a'), this violates W-W pipelining. So, W^a'_alpha
cannot be W^a_alpha's left conflicting neighbor. Therefore,
LOGblocked is either of the form

A^a ... R^b_alpha W^a_alpha or

A^a ... A^b[... R^b_alpha ...] W^a_alpha.


We now show that transclass(a) ≠ transclass(b). Assume
not. Since E^b follows R^b_alpha in LOGout, E^b follows E^a in
LOGout. Since transclass(a) = transclass(b) and E^a
precedes E^b in LOGout, by lemma B E^a precedes E^b in
LOGgiven. Since transclass(a) = transclass(b), R^b_alpha and
W^a_alpha conflict; since R^b_alpha precedes W^a_alpha in
LOGblocked, by lemma B R^b_alpha precedes W^a_alpha in LOGgiven.
This violates W-R pipelining, a contradiction. So,
transclass(a) ≠ transclass(b).

Beginning at R^b_alpha, scan to the left through LOGblocked
until the first atom with a superscript of a" where
transclass(a) = transclass(a") is found. Say this is X^a".
(Note: possibly X^a" = A^a.) Now, beginning with X^a", scan
to the right in LOGblocked until the first atom with a
superscript of b' where transclass(b') = transclass(b) is
found. Say this is Y^b'. (Note: possibly Y^b' = A^b or Y^b'
= R^b_alpha.) Thus, LOGblocked is either of the form

A^a ... X^a" ... Y^b' ... R^b_alpha W^a_alpha or

A^a ... X^a" ... Y^b' ... A^b[... R^b_alpha ...] W^a_alpha.

Consider the left neighbor of Y^b', say Z^c. (Note:
possibly Z^c = X^a".) By choice of Y^b', transclass(b') ≠
transclass(c). In the remainder of this proof, let ā =
transclass(a), b̄ = transclass(b), and c̄ = transclass(c).
Recall ā ≠ b̄, and c̄ ≠ b̄ by construction.


We now construct a new log, LOG'blocked, by excising from
LOGblocked those atoms (if any) separating A^a from X^a" and
those atoms (if any) separating Y^b' from A^b (or R^b_alpha).
Clearly, the resulting log, LOG'blocked, is a blocked
projection of LOGblocked. LOG'blocked is of the form

A^a X^a" ... Z^c Y^b' R^b_alpha W^a_alpha or

A^a X^a" ... Z^c Y^b' A^b[... R^b_alpha ...] W^a_alpha.
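
The scanning and excision steps described above can be pictured with a small sketch. It is only an illustration under simplifying assumptions that are not part of this report: atoms are modelled as (label, transaction) pairs, an A block is treated as a single atom, and the names excise, transclass, and r_index are hypothetical.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, str]  # (label, transaction); a hypothetical encoding of log atoms


def excise(log: List[Atom], r_index: int, transclass: Dict[str, str]) -> List[Atom]:
    """Shorten a blocked log as in the construction above.

    Assumes log[0] is the leftmost atom A^a, log[r_index] is R^b_alpha
    (or the A^b block containing it), and the classes of a and b differ.
    """
    class_a = transclass[log[0][1]]
    class_b = transclass[log[r_index][1]]

    # Scan left from R^b_alpha for the first atom in a's class (this is X).
    x = next(i for i in range(r_index, -1, -1)
             if transclass[log[i][1]] == class_a)
    # Scan right from X for the first atom in b's class (this is Y).
    y = next(i for i in range(x, r_index + 1)
             if transclass[log[i][1]] == class_b)

    head = [] if x == 0 else [log[0]]              # drop atoms between A^a and X
    tail = [] if y == r_index else [log[r_index]]  # drop atoms between Y and R^b_alpha
    return head + log[x:y + 1] + tail + log[r_index + 1:]
```

In the returned list, the atom immediately to the left of Y plays the role of Z; by the choice of Y it cannot belong to b's class.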
Claim AW-path  There is a nonredundant path in CG(D) from
a node with a label superscripted c̄ to a node with a label
superscripted ā such that the path passes through no node
labelled b̄.

This claim, which follows directly from lemma P, will be
applied repeatedly in the remainder of the proof.

We analyze the ways in which Z^c can be in augmented
conflict with Y^b' and show each possible conflict to be
impossible. Since the only assumption made so far is that
the lemma is false, the contradiction that Z^c does not
conflict with Y^b' will prove the lemma.

c  b 1 

Z  can  only  be  in  augmented  conflict  with  Y  due  to  one 

of  NTH’  -  N T K g ,  ANTRp2,  ANTRp2f,  or  ANTRpg .  Since  c  i  5, 

NT  R  *  ,  NTR^ ,  NTR  ^ ,  NTR^ ,  NTR£,  NTR£(i),  and  NTR^(ii) 

cannot  be  the  cause  of  the  conflict.  NTR'  and  NTRA  do 

5  o 

not  apply,  because  no  W  can  appear  in  the  sublog  unless  it 
is  contained  in  an  A,  by  lemma  S,  part  II.  The  remaining 
cases  are  NTR^(ii),  NTR^(iii),  NTR^(iv),  NTRJ(i), 
ANTRp2,  ANTRp2f,  and  ANTRpg-,  They  are  subsumed  by  the 
following  cases: 

I.  Z^c = A^c; there is a beta in DELTA such that W^c_beta is
in A^c and either Y^b' = R^b'_beta or R^b'_beta is in A^b' = Y^b';
and the three-way intersection of

matzn-of-class(b̄)(transreadset(b')) and
logical^-1(transwriteset(c)) and stored-data^-1(beta)

is nonempty.

II.  there is a beta in DELTA such that either Z^c = R^c_beta
or R^c_beta is in A^c = Z^c; Y^b' = A^b' and W^b'_beta is in A^b';
and the three-way intersection of

matzn-of-class(c̄)(transreadset(c)) and
logical^-1(transwriteset(b')) and stored-data^-1(beta)

is nonempty.

III.  Z^c = A^c; Y^b' = A^b'; and the intersection of
transwriteset(c) and transwriteset(b') is nonempty.

IV.  Z^c = A^c and A^c-A^b are in augmented conflict by
ANTRp2, ANTRp2f, or ANTRp3.

We analyze each of the four cases in detail.


CASE I  (Z^c = A^c contains W^c_beta; either Y^b' = R^b'_beta or
R^b'_beta is in A^b' = Y^b'; and R^b'_beta and W^c_beta conflict)

LOG'blocked is of the form:

A^a X^a" ... A^c[... W^c_beta ...] R^b'_beta R^b_alpha W^a_alpha

where possibly R^b'_beta is in A^b' and possibly R^b_alpha is in
A^b. There are two subcases to consider: alpha = beta and
alpha ≠ beta.
Subcase alpha = beta

From conflicts in LOG'blocked, the edges (r^b̄_beta, w^c̄_beta) and
(r^b̄_alpha, w^ā_alpha) exist in CG(D). Since alpha = beta,
(r^b̄_beta, w^c̄_beta) = (r^b̄_alpha, w^c̄_alpha).

By claim AW-path, there is a nonredundant path from w^ā_alpha
to w^c̄_alpha that does not pass through any node in class b̄.

[Figure 4.8: Nonredundant Cycle for Case I (alpha = beta) of
Lemma AW, closed by a nonredundant path with no nodes
labelled b̄.]

So, we have the nonredundant cycle noted in figure 4.8.

If c = a, then W^a_alpha is not unique in the log,
contradiction. So, c ≠ a.

Either c̄ = ā or c̄ ≠ ā. Suppose c̄ = ā. Then E^c and E^a
conflict and by lemma B, E^a precedes E^c in LOGgiven.
Since W^c_alpha and W^a_alpha conflict, by lemma B W^c_alpha
precedes W^a_alpha in LOGgiven. This violates W-W
pipelining, a contradiction. Hence c̄ ≠ ā.
The cycle and the protocol selection rules imply R^b'_beta
must satisfy P2 with respect to ā and c̄ at alpha. By
ANTRp2, E^a and E^c are in augmented conflict and so, by
lemma B, E^a precedes E^c in LOGgiven. Since W^c_alpha and
R^b'_beta conflict, by lemma B, W^c_alpha precedes R^b'_alpha in
LOGgiven. Since beta = alpha, R^b'_beta conflicts with
R^b_alpha, so by lemma B, R^b'_alpha precedes R^b_alpha in LOGgiven.
Similarly, R^b_alpha precedes W^a_alpha in LOGgiven, so by
transitivity, R^b'_alpha precedes W^a_alpha in LOGgiven. But
this says that R^b'_alpha violates P2. Contradiction!


Subcase alpha ≠ beta

From conflicts in LOG'blocked, the edges (r^b̄_beta, w^c̄_beta) and
(r^b̄_alpha, w^ā_alpha) exist in CG(D).

By definition of EDGESvert, the edges (r^b̄_beta, e^b̄) and
(r^b̄_alpha, e^b̄) exist in CG(D).

By claim AW-path, there is a nonredundant path from w^c̄_beta
to w^ā_alpha that does not pass through any nodes in class b̄.

So, we have the nonredundant cycle noted in figure 4.9.

[Figure 4.9: Nonredundant Cycle for Case I (alpha ≠ beta) of
Lemma AW, closed by a nonredundant path with no nodes
labelled b̄.]

This cycle and the protocol selection rules imply that
R^b'_beta and R^b_alpha both have to satisfy P2f with respect to
ā at alpha and c̄ at beta.

We first show that E^b precedes E^b' in LOGgiven. By
ANTRp2f, E^a and E^c are in augmented conflict, so by lemma
B, E^a precedes E^c in LOGgiven. Since R^b'_beta conflicts with
W^c_beta, by lemma B R^b'_beta follows W^c_beta in LOGgiven. Since
E^a precedes E^c, by P2f R^b'_beta follows W^a_alpha in LOGgiven.
But since R^b_alpha precedes W^a_alpha in LOGgiven (by lemma B),
R^b_alpha precedes R^b'_beta in LOGgiven. Hence, by
pipelining, E^b precedes E^b' in LOGgiven.

We now need to show E^b' precedes E^b in LOGgiven to
establish a contradiction. To prove this, we first show
each of the following properties of LOG'blocked:

i.  b ≠ b';

ii.  R^b_alpha is not in A^b;

iii.  R^b'_beta is not in A^b';

iv.  there is no A^b" in LOGout between R^b'_beta and
R^b_alpha with transclass(b") = b̄.

This sufficiently restricts the form of LOG'blocked so that
we will be able to obtain a contradiction.


i.  Suppose b = b'. Then R^b'_beta = R^b_beta and R^b_alpha must
satisfy P2f with respect to c̄ at beta and ā at alpha
(respectively). By ANTRp2f, E^a and E^c are in augmented
conflict, so by lemma B, E^a precedes E^c in LOGgiven.
Since W^c_beta conflicts with R^b_beta and R^b_alpha conflicts with
W^a_alpha, by lemma B, W^c_beta precedes R^b_beta and R^b_alpha
precedes W^a_alpha in LOGgiven. But this violates P2f,
contradiction! So, b ≠ b'.

ii.  Suppose R^b_alpha is in A^b. By part III of lemma S,
R^b_beta is also in A^b. Since R^b_beta conflicts with W^c_beta and
R^b_alpha conflicts with W^a_alpha, we obtain the same P2f
violation as (i). So, R^b_alpha is not in A^b.

iii.  R^b'_beta is not in A^b' by the same argument as (ii).

iv.  There is no other A^b" in between R^b'_beta and R^b_alpha by
the same argument as (ii).


From (iv) and part II of lemma S, we conclude that every
atom in class b̄ in between R^b'_beta and R^b_alpha in LOGout is
an R that is not contained in an A. Consider one such
R^b"_gamma in this sublog. Each neighbor of R^b"_gamma must be
either another R in b̄ or a W^d_gamma whose writeset
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intersects  the  readset  of  b"  as  per  NTR^.  By  lemma  S  part 

II, W^d_gamma must be in A^d. Hence, the sublog between R^b'_beta
and R^b_alpha is of the form:

A^c R^b'_beta R^b1_beta ... R^b2_beta A^d[... W^d_beta ...] ...
A^e[... W^e_gamma ...] R^b3_gamma ... R^b4_gamma A^f[... W^f_gamma ...]
... etc. ... R^b_alpha W^a_alpha.

Consider one pair of R^b̄'s in this sublog that have no R's
in class b̄ in between them. For example, consider

R^b4_gamma A^f[... W^f_gamma ...] ... A^g[... W^g_delta ...] R^b5_delta.

We want to show that E^b4 precedes E^b5 in LOGgiven.

Suppose gamma = delta. Since R^b4_gamma conflicts with and
precedes R^b5_gamma in LOG'blocked, by lemma B, R^b4_gamma precedes
R^b5_gamma in LOGgiven.


Suppose gamma ≠ delta. By lemma P, there is a path in
CG(D) from f̄ = transclass(f) to ḡ = transclass(g) that
does not pass through any node in b̄. From the log, the
edges (r^b̄_gamma, w^f̄_gamma) and (r^b̄_delta, w^ḡ_delta) are in
CG(D). By definition of EDGESvert, the edges
(r^b̄_delta, e^b̄) and (e^b̄, r^b̄_gamma) are in CG(D). So, we
have a nonredundant (P2f) cycle. Since R^b4_gamma conflicts
with W^f_gamma and R^b5_delta conflicts with W^g_delta, by lemma B,
R^b4_gamma precedes W^f_gamma and W^g_delta precedes R^b5_delta in
LOGgiven. By ANTRp2 or ANTRp2f (depending on whether or
not f̄ = ḡ), E^f and E^g are in augmented conflict, so by
lemma B, E^f precedes E^g in LOGgiven. By P2f, since
R^b5_delta follows W^g_delta, then R^b5_gamma must follow W^f_gamma.
Since R^b4_gamma precedes W^f_gamma, R^b4_gamma precedes R^b5_gamma
(by lemma B). Hence, by R-R pipelining E^b4 precedes E^b5.


Recall that the log between R^b'_beta and R^b_alpha is of the
form:

A^c R^b'_beta R^b1_beta ... R^b2_beta A^d[... W^d_beta ...] ...
A^e[... W^e_gamma ...] R^b3_gamma ... R^b4_gamma A^f[... W^f_gamma ...]
... etc.

By R-R pipelining E^b' precedes E^b2 in LOGgiven. By the
above argument, E^b2 precedes E^b3, so by transitivity
E^b' precedes E^b3. By R-R pipelining and transitivity, E^b'
precedes E^b4 in LOGgiven. By continuing the induction on
the number of R's in b̄ in between R^b'_beta and R^b_alpha, we
have that E^b' precedes E^b in LOGgiven. This establishes a
contradiction, thereby completing case I for alpha ≠ beta.


Case II  (either R^c_beta = Z^c or R^c_beta is in Z^c = A^c; W^b'_beta
is in A^b' = Y^b'; W^b'_beta and R^c_beta conflict)

LOG'blocked is of the form

A^a X^a" ... R^c_beta A^b'[... W^b'_beta ...] R^b_alpha W^a_alpha

where possibly R^c_beta is in A^c.

From conflicts in LOG'blocked, the edges (r^c̄_beta, w^b̄_beta) and
(r^b̄_alpha, w^ā_alpha) are in CG(D).
By claim AW-path, there is a nonredundant path from w^ā_alpha
to r^c̄_beta that does not pass through any node in class b̄.

So, we have the nonredundant cycle noted in figure 4.10.

The cycle and the protocol selection rules imply R^b_alpha
must satisfy P3 with respect to ā at alpha. By ANTRp3, E^a
is in augmented conflict with E^b. Since E^b follows E^a in
LOG'blocked (because R^b_alpha follows E^a in LOG'blocked), by
lemma B, E^a precedes E^b in LOGgiven. Since R^b_alpha
conflicts with W^a_alpha, by lemma B, R^b_alpha precedes W^a_alpha
in LOGgiven. But this means that R^b_alpha violates P3 with
respect to ā at alpha. Contradiction!

Case III  (Z^c = A^c; Y^b' = A^b'; the intersection of
transwriteset(c) and transwriteset(b') is nonempty)

LOG'blocked is of the form

A^a X^a" ... A^c A^b' R^b_alpha W^a_alpha

From conflicts in LOG'blocked, the edges (e^c̄, e^b̄) and
(r^b̄_alpha, w^ā_alpha) are in CG(D). By definition of EDGESvert,
(e^b̄, r^b̄_alpha) is in CG(D).

By claim AW-path, there is a nonredundant path from w^ā_alpha
to e^c̄ that does not pass through any nodes in class b̄.

[Figure 4.11: Nonredundant Cycle for Case III of Lemma AW,
closed by a nonredundant path with no nodes labelled b̄.]

So, we have the nonredundant cycle noted in figure 4.11.

The cycle and the protocol selection rules imply R^b_alpha
must satisfy P3 with respect to ā at alpha. The remainder
of the argument is identical to case II.


Case IV  (Z^c = A^c and A^c-A^b are in augmented conflict by
ANTRp2, ANTRp2f, or ANTRp3). LOG'blocked is of the form

A^a X^a" ... Z^c Y^b' R^b_alpha W^a_alpha

where possibly R^b_alpha is in A^b.




From the log, the edge (r^b̄_alpha, w^ā_alpha) is in CG(D). From
claim AW-path and the sublog X^a" ... Z^c, there is a
nonredundant path in CG(D) from a node in c̄ to a node in ā
that does not pass through any node in b̄. There are now
three subcases to consider, one for each of ANTRp2, ANTRp2f,
and ANTRp3, the only ways that A^c-A^b can be in
conflict.

Subcase IV - ANTRp2

By ANTRp2, there is a class, d̄, and a data module, beta,
such that there is a nonredundant cycle in CG(D) with the
subpath (w^c̄_beta, r^d̄_beta, w^b̄_beta). We want to show a subpath
(w^b̄_beta, e^b̄, r^b̄_alpha, w^ā_alpha) in a cycle in CG(D). By
definition of EDGESvert, the edges (w^b̄_beta, e^b̄) and
(e^b̄, r^b̄_alpha) are in CG(D). If d̄ ≠ ā, then the subpath
(w^c̄_beta, r^d̄_beta, w^b̄_beta) and the nonredundant path from c̄
to ā that does not pass through b̄ are sufficient to
complete the cycle (see figure 4.12). If d̄ = ā, then the
edge (r^d̄_beta, w^b̄_beta) and the edges (r^d̄_beta, e^ā) and
(e^ā, w^ā_alpha) from EDGESvert are sufficient to complete the
cycle (see figure 4.12). So, we have a nonredundant cycle
in CG(D) with the subpath (w^b̄_beta, e^b̄, r^b̄_alpha, w^ā_alpha).
Hence, by the protocol selection rules, R^b_alpha must satisfy
P3 with respect to ā at alpha. However, P3 is violated by
R^b_alpha in the following way: By ANTRp3 and the cycle, E^a
and E^b are in augmented conflict. So, since E^a precedes
R^b_alpha, which precedes E^b in LOG'blocked, by lemma B, E^a
precedes E^b in LOGgiven. Since R^b_alpha conflicts with
W^a_alpha, R^b_alpha precedes W^a_alpha in LOGgiven. This violates
P3, a contradiction.

[Figure 4.12: Nonredundant Cycles for Subcase IV - ANTRp2 of
Lemma AW.]

Subcase IV - ANTRp2f

By ANTRp2f, there is a class, d̄, and two distinct data
modules, beta and gamma, such that there is a cycle in
CG(D) with a subpath

(w^c̄_beta, r^d̄_beta, e^d̄, r^d̄_gamma, w^b̄_gamma).

We can proceed exactly as in Subcase IV - ANTRp2, yielding
the same P3 violation (see figure 4.13).

[Figure 4.13: Nonredundant Cycles for Subcase IV - ANTRp2f of
Lemma AW.]

Subcase IV - ANTRp3

By ANTRp3, there is a data module beta, such that there is
a cycle in CG(D) with either the subpath

(e^c̄, w^c̄_beta, r^b̄_beta, e^b̄)

or

(e^c̄, r^c̄_beta, w^b̄_beta, e^b̄).

We treat each subpath as a separate case.

Subcase IV - ANTRp3 - (e^c̄, r^c̄_beta, w^b̄_beta, e^b̄)

We can deduce that the edge (r^c̄_beta, w^b̄_beta) is in CG(D),
and by definition of EDGESvert (w^b̄_beta, e^b̄) and
(e^b̄, r^b̄_alpha) are in CG(D) (see figure 4.14a). As in
subcase IV - ANTRp2, R^b_alpha must satisfy P3, but violates
P3 in LOGgiven, a contradiction.

[Figure 4.14: Nonredundant Cycles for Subcase IV - ANTRp3 of
Lemma AW, panels (a) and (b).]
Subcase IV - ANTRp3 - (e^c̄, w^c̄_beta, r^b̄_beta, e^b̄)

By these augmented edges and ANTRp3, there is a path from
e^b̄ to e^c̄ that does not pass through any r^b̄_gamma (including
r^b̄_alpha). This completes the cycle (see figure 4.14b) and
we can proceed to a P3 violation as in Subcase IV -
ANTRp2.

Q. E. D.
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5.  Protocol  P4,  A  Cycle-Breaking  Protocol 


5.1  Motivation  for  a  Cycle-Breaking  Protocol 


From a logical standpoint, {P1, P2, P2f, P3} are a
sufficient set of mechanisms to correctly execute all
transactions in all classes. That is, with these protocol
schemas alone, serial reproducibility can be guaranteed.
However, from an efficiency standpoint, these protocol
schemas have a serious problem. The problem is that a
single class can cause cycles in the conflict graph and
thereby force many classes to run expensive protocols,
even though very few transactions are ever run in that
class.


While we expect that the vast majority of transactions
that we wish to execute are predictable and belong to
predefined classes, we still want to be able to execute an
unexpected transaction that does not fit into any of our
class definitions. One way to accomplish this is to
define a very "large" class, call it Ctotal, that has a
read-set and write-set that includes the entire logical
database. Every conceivable transaction can fit into
Ctotal, so this apparently solves the problem. But the
cost is enormous, for Ctotal induces a two-class cycle
with every other class in the system. So, every class has
to run P3 against Ctotal, and Ctotal has to run P3 against
every other class. Since P3 is the most expensive
protocol schema, this is an unfortunate state of affairs.
It is especially unfortunate because transactions will
rarely need to execute in Ctotal, since most transactions
fit into other less expensive classes. So, Ctotal
introduces considerable synchronization overhead for
synchronizing against a class that will rarely run a
transaction.


In general, any class in which transactions are only
infrequently run, but which creates many cycles in the
conflict graph, exhibits this phenomenon. Clearly, the
problem of proliferation of cycles is especially acute in
Ctotal. However, other classes with smaller read-sets and
behave as if the cycle did not exist (and, therefore, run
P1 with respect to that cycle). In other words, the
protocol selection rules only apply to cycles that do not
contain a class that runs P4.

That we need a P4 cycle-breaking protocol is clear. In
the remainder of this section, we discuss how such a
protocol can be implemented.


5.2  Overview of P4

One way to implement P4 is to shut off the system so that
no new transactions can be introduced. After all
outstanding WRITE messages have been processed, the
system has quiesced. Assuming every class was running the
correct protocol, the log (up to this point) should be
serially reproducible. Now, we run the P4 transaction.
After all of this transaction's WRITE messages arrive and
are processed, it is safe to start up the system again,
allowing new transactions to be run. What we have done is
turn off the system, wait until a serially reproducible
database state is reached, run the P4 transaction to
completion, and then start up the system again. The P4
transaction partitions the log in half, and each half is
serially reproducible (since the other transactions are
running the correct protocols).
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The  degradation  of  performance  that  results  from  shutting 
off  the  system,  even  temporarily,  is  likely  to  be  severe. 
So,  the  above  P4  algorithm  is  unacceptable.  To  weaken  it, 
we  observe  that  the  P4  need  only  synchronize  against  other 
classes  that  lie  on  the  cycle  including  the  P4  class, 
since  only  classes  on  cycles  can  cause  non-serially 
reproducible  logs.  Also,  we  note  that  even  these  classes 
need  not  quiesce  completely  before  the  P4  runs.  All  that 
we  need  is  the  weaker  condition  that  the  log  be  equivalent 
to  some  log  in  which  all  of  the  classes  have  quiesced 
before  the  P4.  With  these  observations  in  mind,  a  much 
weaker  P4  can  be  derived. 


5.3  Implementation of P4


Protocol schema P4 differs structurally from the other
protocol schemas in two ways: First, P4 requires some
direct communication between transaction modules. By this
communication, the P4 class requests that certain other
transaction modules perform synchronization to avoid
conflicting with the P4 transaction. Second, P4 requires
an augmented form of read condition. Recall that a
standard read condition is a pair <timestamp, {classes}>.
For P4, the timestamp may be interpreted as a "minimum
time", i.e., <mintime=timestamp, {classes}>. This
condition is satisfied if all WRITE messages from
{classes} timestamped less than "timestamp" have been
received. It does not require that no messages from
classes timestamped greater than "timestamp" be received
(as in standard read conditions).
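
The difference between the two kinds of read condition can be sketched in a few lines. This is a minimal illustration, not the SDD-1 implementation: it assumes a data module somehow keeps, per class, a bound below which all WRITE messages are known to have arrived and the largest timestamp received so far. The names WriteHistory, received_below, and newest_received are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class WriteHistory:
    """Hypothetical per-data-module bookkeeping, keyed by class name."""
    # Every WRITE from the class timestamped below this value has been received.
    received_below: Dict[str, int] = field(default_factory=dict)
    # Largest timestamp among the WRITEs received so far from the class.
    newest_received: Dict[str, int] = field(default_factory=dict)


def mintime_satisfied(h: WriteHistory, timestamp: int, classes: Iterable[str]) -> bool:
    """<mintime=timestamp, {classes}>: only the earlier WRITEs must be in."""
    return all(h.received_below.get(c, 0) >= timestamp for c in classes)


def standard_satisfied(h: WriteHistory, timestamp: int, classes: Iterable[str]) -> bool:
    """<timestamp, {classes}>: additionally, no later WRITE may have been received."""
    classes = list(classes)
    return mintime_satisfied(h, timestamp, classes) and all(
        h.newest_received.get(c, 0) <= timestamp for c in classes
    )
```

Under these assumptions, a data module that has already received a WRITE timestamped 15 from a class can still satisfy a mintime condition at timestamp 10, whereas the standard condition at 10 would fail.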

To implement P4, we use three additional types of messages
that are sent from TM's to TM's (not from TM's to DM's).
A P4-ALERT message is sent from a P4 class to some other
class. A P4-ALERT message includes the P4 class's name
and timestamp as its parameters. A class responds to a
P4-ALERT with either a P4-ACC (i.e., an acceptance) or a
P4-REJ (i.e., a rejection).

To run a transaction tp4 in the P4 class cp4, one performs
the following steps:

1.  Choose a timestamp for tp4, say TSp4.

2.  Send a message P4-ALERT(TSp4) to every class that
lies on the cycle in CG(D).

3.  Wait for the P4-ACC's to be received from all
classes to which a P4-ALERT was sent. If a P4-REJ
is received, then restart the protocol from step 1.

4.  Construct the READ messages for tp4. For each data
module, alpha, to which a read message will be
sent, include a condition <TSp4, ci> for each class
ci such that the edge (r^p4_alpha, w^i_alpha) lies on the
cycle.
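
The four steps above can be sketched as a small driver loop. This is a hedged illustration only: message passing is replaced by a caller-supplied function, and the names run_p4, send_alert, writers_on_cycle, and the max_attempts bound are assumptions introduced here, not part of the protocol description.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def run_p4(
    choose_timestamp: Callable[[], int],
    send_alert: Callable[[str, int], str],
    cycle_classes: Sequence[str],
    writers_on_cycle: Dict[str, List[str]],
    max_attempts: int = 10,  # assumption: give up eventually instead of looping forever
) -> Tuple[int, Dict[str, List[Tuple[int, str]]]]:
    """Return the accepted timestamp and the read conditions per data module.

    writers_on_cycle maps each data module the P4 transaction reads to the
    classes whose write edge at that module lies on the cycle.
    """
    for _ in range(max_attempts):
        ts = choose_timestamp()                                # step 1
        replies = [send_alert(c, ts) for c in cycle_classes]   # steps 2 and 3
        if any(reply == "P4-REJ" for reply in replies):
            continue                                           # restart from step 1
        conditions = {                                         # step 4
            dm: [(ts, c) for c in writers]
            for dm, writers in writers_on_cycle.items()
        }
        return ts, conditions
    raise RuntimeError("P4 transaction was rejected on every attempt")
```

In this sketch a rejection simply triggers a fresh timestamp; how a real transaction module picks that timestamp is left open.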

When a transaction module receives a P4-ALERT(tp4, TSp4)
for a particular class, ci, it performs the following
steps:

1.  If the class has run or begun running a transaction
with a timestamp greater than TSp4, then respond to
cp4 by sending P4-REJ. Otherwise, send P4-ACC and
do not run another transaction in ci timestamped
earlier than TSp4.

2.  For the next transaction run in ci, for each
data module alpha and each class cj such that edge
(r^i_alpha, w^j_alpha) lies on a cycle with cp4, include
the condition <mintime=TSp4, cj> in the READ
message to DMalpha. These conditions are in
addition to those normally included by ci in its
read messages.

It should be emphasized that (2) is only performed for the
first transaction executed in ci with timestamp greater
than TSp4. Later transactions in ci can run P1 again,
with respect to this P4 cycle.
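
The responder side can be sketched in the same spirit. The following is a simplified illustration under stated assumptions: one outstanding P4-ALERT per class at a time, integer timestamps, and hypothetical names (ClassState, on_p4_alert, start_transaction) that do not come from this report.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClassState:
    latest_ts: int = 0                     # largest timestamp run or begun in this class
    floor_ts: int = 0                      # nothing earlier than this may run any more
    pending_mintime: Optional[int] = None  # TSp4 owed to the next transaction's READs


def on_p4_alert(state: ClassState, ts_p4: int) -> str:
    """Step 1: reject if a later transaction already ran, otherwise accept."""
    if state.latest_ts > ts_p4:
        return "P4-REJ"
    state.floor_ts = max(state.floor_ts, ts_p4)
    state.pending_mintime = ts_p4
    return "P4-ACC"


def start_transaction(
    state: ClassState, txn_ts: int, cycle_edges: List[Tuple[str, str]]
) -> List[Tuple[str, int, str]]:
    """Step 2: extra (data module, mintime, class) conditions for this transaction.

    cycle_edges lists (data module, writer class) pairs whose edge lies on a
    cycle with the P4 class. Only the first transaction after an accepted
    alert receives the extra conditions.
    """
    if txn_ts < state.floor_ts:
        raise ValueError("transaction is timestamped earlier than an accepted TSp4")
    state.latest_ts = max(state.latest_ts, txn_ts)
    if state.pending_mintime is None:
        return []
    ts_p4, state.pending_mintime = state.pending_mintime, None
    return [(dm, ts_p4, cj) for dm, cj in cycle_edges]
```

Clearing pending_mintime after one use mirrors the remark that later transactions in the class can return to P1 with respect to this cycle.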

5.4  Proof  of  Correctness  for  Protocol  P4 

A  proof  of  serial  reproducibility  incorporating  protocol 
P4 has been developed and will appear in a later Technical
Report.
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A.  Update  Semantics  and  Fragment  Definition 


A.1  Insertion / Deletion Semantics

The  basic  update  operation  in  SDD-1  is  a  WRITE  message 
that  changes  the  value  of  existing  data  items  (see  Section 
2).  To  enable  insertions  and  deletions  using  this  write 
message  format,  we  augment  each  relation  by  a  special 
boolean  domain  named  "Existence-bit"  (abbr.  E-bit).  From 
a  logical  viewpoint,  every  TID  value  is  "present"  in  the 
sense  that  it  can  be  referenced.  We  distinguish  between 
TIDs  that  label  real  tuples  and  those  that  label  an  empty 
slot  for  a  tuple  by  the  E-bit:  If  E-bit=1  then  the  tuple 
exists in the relation; otherwise, the tuple does not
exist.

Using this model, we define four operations on relations:
RETRIEVE,  DELETE,  INSERT,  and  CONDITIONAL  INSERT.  These 
are  the  kinds  of  operations  that  we  expect  users  will  want 
to  perform  on  SDD-1  relations,  and  they  essentially 
correspond  to  standard  query  language  commands.  RETRIEVE 
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selects  a  portion  of  a  relation  to  be  read;  it  only  reads 
tuples  with  the  E-bit  =  1.  DELETE  simply  sets  E-bit  =  0 
for  the  tuples  to  be  deleted.  INSERT  sets  E-bit  =  1  for 
the  TID  values  for  tuples  to  be  inserted.  CONDITIONAL 
INSERT  inserts  TIDs  provided  they  do  not  already  exist,  by 
checking  that  E-bit  =  0  before  setting  E-bit  =  1.  This 
latter  operation  may  be  needed  to  avoid  overwriting 
already  existing  tuples. 
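
The four operations can be illustrated with a toy model in which a relation is a mapping from TID to a record carrying an "E-bit" field. This is only a sketch under that assumption; the representation and the function names are hypothetical and say nothing about how SDD-1 stores relations.

```python
from typing import Dict, Iterable

Relation = Dict[int, dict]  # TID -> domain values, including "E-bit" (hypothetical layout)


def retrieve(rel: Relation, tids: Iterable[int]) -> Dict[int, dict]:
    """Read only tuples whose E-bit is 1; unknown TIDs count as empty slots."""
    return {t: rel[t] for t in tids if rel.get(t, {}).get("E-bit", 0) == 1}


def delete(rel: Relation, tids: Iterable[int]) -> None:
    """DELETE is a write that sets E-bit = 0."""
    for t in tids:
        rel.setdefault(t, {})["E-bit"] = 0


def insert(rel: Relation, tid: int, values: dict) -> None:
    """INSERT is a write that sets E-bit = 1, overwriting any existing tuple."""
    rel[tid] = {**values, "E-bit": 1}


def conditional_insert(rel: Relation, tid: int, values: dict) -> bool:
    """Insert only if the slot is empty (E-bit = 0); report whether it happened."""
    if rel.get(tid, {}).get("E-bit", 0) == 1:
        return False
    insert(rel, tid, values)
    return True
```

Note that conditional_insert has to read the E-bit before writing it, which is one way to see why the E-bit belongs in both the read-set and the write-set of such a transaction.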

The  E-bit  domain  must  be  used  in  determining  the  read-set 
and  write-set  for  a  class  of  transactions.  Insert  and 
delete  operations  are  in  conflict  precisely  insofar  as 
they  both  use  the  E-bit  domain,  and  this  conflict  may 
require  adding  some  edges  to  the  conflict  graph. 


A.2  Fragment Updates


Recall  the  definition  of  logical  fragments.  First 
partition  the  relation  according  to  a  set  of  restrictions 
and  then  define  each  logical  fragment  to  be  a  projection 
of  a  partition  on  the  TID  domain  and  one  other  domain. 

That  fragments  are  defined  logically  creates  certain 
problems  on  updates.  If  the  restriction  qualification 
that  defines  a  fragment  uses  domain  D,  say,  then  updates 
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to  a  D-value  may  cause  a  tuple  to  "migrate"  from  one 
partition  (and  hence  fragment)  to  another.  For  example, 
if  the  EMPLOYEE  relation  is  partitioned  based  on  the 
DEPARTMENT  domain,  then  moving  an  employee  to  a  new 
department  causes  a  tuple  migration  to  a  different 
partition.  Since  fragments  in  different  partitions  are 
stored as independent files, often at different data
modules,  the  tuple  migration  requires  WRITE  messages  to 
delete  the  tuple  from  one  fragment  and  add  it  to  another. 
When  determining  the  read-sets  and  write-sets  of  a 
transaction  class,  potential  tuple  migrations  must  be 
considered,  since  additional  WRITE  messages  may  be 
required  to  maintain  the  consistency  of  the  fragment 
within  its  definition. 
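
A tuple migration can be sketched as follows. The example assumes, purely for illustration, that a WRITE message can be modelled as a (partition, TID, values) triple and that deletion and insertion use the E-bit convention of Section A.1; the names writes_for_update and partition_of are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Write = Tuple[str, int, Dict[str, object]]  # (partition, TID, new domain values)


def writes_for_update(
    tid: int,
    old_tuple: Dict[str, object],
    new_tuple: Dict[str, object],
    partition_of: Callable[[Dict[str, object]], str],
) -> List[Write]:
    """Return the WRITE messages needed to apply an update to one tuple.

    If the update changes the partition the tuple belongs to, the tuple is
    deleted from the old partition and added to the new one.
    """
    old_p, new_p = partition_of(old_tuple), partition_of(new_tuple)
    if old_p == new_p:
        return [(old_p, tid, dict(new_tuple))]
    return [
        (old_p, tid, {"E-bit": 0}),               # delete from the old fragment
        (new_p, tid, {**new_tuple, "E-bit": 1}),  # add to the new fragment
    ]
```

With EMPLOYEE partitioned on DEPARTMENT, moving an employee between departments yields two writes to two different partitions, which is why the class's write-set must cover both.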
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